Distances in cosmology are usually inferred from observed redshifts, an estimate that depends on the local
peculiar motion. This gives a distorted view of the three-dimensional structure and affects basic observables
such as the correlation function and power spectrum. We calculate the full non-linear redshift-space power
spectrum for Gaussian fields, giving results for both the standard flat-sky approximation and the directly
observable angular correlation function and angular power spectrum $C_l(z, z')$. Coupling between large- and
small-scale modes boosts the power on small scales when the perturbations are small. On larger scales power is
slightly suppressed by the velocity perturbations on smaller scales. The analysis is general, but we comment
specifically on the implications for future high-redshift observations, and show that the non-linear spectrum
has significantly more complicated angular structure than in linear theory. We comment on the implications for
using the angular structure to separate cosmological and astrophysical components of 21 cm observations.

I. INTRODUCTION

At the most fundamental level cosmological observations consist of measurements of radiation intensity and
frequency as a function of angle on the sky. From these we can try to infer properties of the Universe on our past
light cone, and from them learn about cosmology. To make more than the simplest inferences we must find a reliable
distance to the source we are observing. Fortunately, if the frequency of an emitting source is known, the redshift
can be used as a measurement of distance, allowing us to map our past light cone as a function of angle and redshift.
The observed redshift includes several effects, but the most important is that from cosmological expansion, which
allows us to estimate the distance. Secondary to this is the Doppler shifting from the peculiar velocity of the source
along our line of sight. For measurements of the displacement of a source from us, the peculiar velocity quickly
becomes negligible in comparison to the cosmological redshifting. However, when measuring the separation between
spatially close points the correlated peculiar velocities can have an important effect. When inferring the statistics of
cosmological fluctuations it is therefore important to carefully model the effect of velocities.

The universe is assumed to be spatially statistically homogeneous and isotropic at a given time. The non-linear 
mapping between real space (measured by comoving distance) and redshift space (measured by the redshift z) means 
that a Gaussian field (with Gaussian densities and velocities) will no longer be Gaussian when observed in redshift 
space, and its power spectrum will also be different. In this paper we show how to calculate the non-linear
redshift-space power spectrum and quantify the effects numerically. The linear result is well known [1], but here we
use a non-perturbative approach to calculate results to all orders. As we shall see, the non-linear corrections can 
be important at small scales even at high redshift, and are therefore potentially important for future high-redshift 
observations. 

When the non-linear corrections become important, for full consistency one should also calculate the non-linear 
evolution of the fields: an initially Gaussian random field will be modified once non-linear growth starts to be 
perturbatively important 0, 0, Q . These non-linear effects are more complicated to model, and depend on which
source is being observed; for example, the 21cm source evolution is quite different to that of galaxy number counts. 
In this paper we therefore neglect these complications, focussing on understanding the important implications of the 
redshift-space mapping alone, with the important caveat that our results must be generalized for application to real 
observations. Our analysis is applicable to any observable that can be reasonably approximated as having a Gaussian 
source field with Gaussian velocities, and hence, within our approximation, applies equally to biased source number 
counts or 21cm. 

Since the line of sight defines a vector field on the past light cone, the light cone as a function of redshift and angle 
is only statistically isotropic about the centre of symmetry — the observation point. The inferred angular structure 
of the field about other points therefore gives information about the local velocity field. In linear theory the velocities 
are simply related to the total density when dark matter and baryon velocities are the same. Hence an observation
of the velocities could be used to constrain directly the cosmological density field independently of the sources, which
could be hard to model because of complicated astrophysics. We show that the non-linear corrections to the angular 
structure can be important when attempting to measure the densities this way. 

This paper will continue as follows. In the next sub-section (Section I A) we briefly overview the results from linear
theory. In Section II we introduce our method for calculating the non-linear redshift-space power spectra. Section III
discusses the differences encountered when calculating the power spectrum of radiative fields like the brightness
compared to spatial densities such as the matter perturbation. In Section IV we calculate the three-dimensional
power spectrum and discuss the results. To go beyond this to the full sky, in Section V we calculate the angular
correlation function and angular power spectrum. Finally we discuss what bearing our results have on high-redshift
21cm observations in Section VI.

Throughout the rest of this paper we assume a standard flat concordance $\Lambda$CDM cosmology with matter densities
$\Omega_c h^2 = 0.104$, $\Omega_b h^2 = 0.022$ for dark and baryonic matter respectively. We take a Hubble parameter of
$H_0 = 73\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$, and optical depth to Thomson scattering $\tau = 0.09$. We use a primordial
power spectrum with constant spectral index $n_s = 0.95$ and amplitude $A_s = 2.04 \times 10^{-9}$ at a scale of
$0.05\,\mathrm{Mpc}^{-1}$. Furthermore we neglect the neutrino masses, which should have small effects at high $k$.



A. Redshift-space mapping and linear result 

Assuming the redshift is entirely cosmological, the comoving distance to an object at redshift z is 

$$ \chi_z = \int_0^z \frac{dz'}{(1+z')\,\mathcal{H}(z')} , \tag{1} $$

where $\mathcal{H}$ is the comoving Hubble parameter and throughout we use natural units with $c = 1$. In general this
equation defines what we call the redshift-space distance, which can easily be calculated from the observed redshift
given a background cosmology. However it is not equal to the actual comoving distance in a perturbed universe: the
peculiar velocity means that the actual comoving distance $\chi$ at redshift $z$ is not $\chi_z$, but also depends on
the local velocity field $\mathbf{v}(\mathbf{x})$. Neglecting local evolution of the background, small lensing and
general-relativistic effects, and assuming that the peculiar velocities are non-relativistic, the comoving distance is
in fact

$$ \chi = \chi_z - \left. \frac{\mathbf{v}(\mathbf{x}) \cdot \hat{\mathbf{n}}}{\mathcal{H}} \right|_z . \tag{2} $$

Note that we assume the peculiar velocity of the observer is removed from the observed redshifts so that only the
source velocity matters. From here onwards we write $\phi(\mathbf{x}) = \mathbf{v}(\mathbf{x}) \cdot \hat{\mathbf{n}} / \mathcal{H}$,
and denote our coordinates in real space as $\mathbf{x} = \chi \hat{\mathbf{n}}$, and redshift space as
$\mathbf{s} = \chi_z \hat{\mathbf{n}}$, such that the mapping between the two is

$$ \mathbf{s} = \mathbf{x} + \phi(\mathbf{x})\, \hat{\mathbf{n}} . \tag{3} $$
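
As a rough numerical illustration of Eqs. (1)-(3), the sketch below integrates the redshift-space distance for the flat $\Lambda$CDM parameters quoted above and converts a line-of-sight peculiar velocity into the comoving displacement $\phi$. It is only a sketch under stated assumptions: radiation is neglected, $c$ is restored in km/s, and the helper names (`hubble`, `comoving_distance`, `displacement`) and the example velocity of 100 km/s are illustrative choices, not taken from the paper.

```python
import math

C_KMS = 299792.458                  # speed of light in km/s (the text sets c = 1)
H0 = 73.0                           # km/s/Mpc, as quoted in the text
LITTLE_H = H0 / 100.0
OMEGA_M = (0.104 + 0.022) / LITTLE_H**2   # dark plus baryonic matter
OMEGA_L = 1.0 - OMEGA_M                   # flatness; radiation neglected (assumption)


def hubble(z):
    """Proper Hubble rate H(z) in km/s/Mpc for flat LCDM without radiation."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, n=2000):
    """Eq. (1) in Mpc: chi_z = c * int_0^z dz' / H(z'), using (1 + z) * calH = H.

    Evaluated with Simpson's rule on n (even) intervals.
    """
    if z == 0.0:
        return 0.0
    if n % 2:
        n += 1
    dz = z / n
    total = 1.0 / hubble(0.0) + 1.0 / hubble(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) / hubble(i * dz)
    return C_KMS * total * dz / 3.0


def displacement(v_los, z):
    """phi = v.n / calH in comoving Mpc, with calH = H / (1 + z) and v_los in km/s."""
    return v_los * (1.0 + z) / hubble(z)


chi_z = comoving_distance(10.0)     # redshift-space distance at z = 10
phi = displacement(100.0, 10.0)     # displacement for an illustrative 100 km/s
chi = chi_z - phi                   # Eq. (2): actual comoving distance
print(f"chi_z = {chi_z:.1f} Mpc, phi = {phi:.3f} Mpc, phi/chi_z = {phi / chi_z:.2e}")
```

The displacement is tiny compared with the distance itself, which is why velocities matter mainly for the separation between nearby points rather than for the overall distance to a source.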

The effect at first order in the power spectrum is well known and easy to calculate [1]. Transforming the mass in a
small volume element from real to redshift space using the Jacobian factor we have

$$ d^3s = d^3x \left| \frac{\partial \mathbf{s}}{\partial \mathbf{x}} \right| . \tag{4} $$

In the distant observer approximation we neglect the curvature of the sky, and thus the Jacobian factor contains only
the line-of-sight term, $|\partial \mathbf{s} / \partial \mathbf{x}| = 1 + \phi'$, with the prime denoting
differentiation with respect to the line-of-sight direction. We discuss this point in more depth in Section III.
Conserving the mass in the elements gives

$$ \bar{\rho}\, [1 + \Delta_s(\mathbf{s})]\, d^3s = \bar{\rho}\, [1 + \Delta(\mathbf{x})]\, d^3x \tag{5} $$

and hence

$$ \Delta_s(\mathbf{s}) = \frac{\Delta(\mathbf{x}) - \phi'(\mathbf{x})}{1 + \phi'(\mathbf{x})} , \tag{6} $$

where the source perturbation in real space is $\Delta$ and in redshift space is $\Delta_s$. Expanding this to first
order gives the redshift-space perturbation

$$ \Delta_s(\mathbf{s}) \approx \Delta(\mathbf{s}) - \phi'(\mathbf{s}) . \tag{7} $$

Note that in this we use the fact that $\mathbf{s} = \mathbf{x}$ at first order to transform the arguments. In Fourier
space we have

$$ \Delta_s(\mathbf{k}) = \Delta(\mathbf{k}) - i k_\parallel \phi(\mathbf{k}) , \tag{8} $$

where $k_\parallel = \hat{\mathbf{n}} \cdot \mathbf{k}$. The quantity we are interested in is the power spectrum $P_s$
of $\Delta_s$, given by

$$ P_s(\mathbf{k}) = P_\Delta(\mathbf{k}) + 2 i k_\parallel P_{\Delta\phi}(\mathbf{k}) + k_\parallel^2 P_\phi(\mathbf{k}) , \tag{9} $$

where $P_\Delta$, $P_{\Delta\phi}$ and $P_\phi$ are generated by the obvious contraction of $\Delta$ and $\phi$. For
irrotational flows we can link the velocity vector field to an underlying scalar perturbation $\delta_v$, defined by
the relation

$$ \nabla \cdot \mathbf{v}(\mathbf{x}) = -\mathcal{H}\, \delta_v(\mathbf{x}) . \tag{10} $$

This definition applies generally and places no constraints on our tracer $\Delta$. It is, however, motivated by the
continuity equation for pressureless matter in the linear growth era. In this regime the scalar perturbation to the
velocities $\delta_v$ simply relates to the total matter perturbation $\delta_m$ via $\delta_v = f \delta_m$, where $f$
is the logarithmic derivative of the linear growth factor for matter perturbations, $f = d\ln D_+ / d\ln a$. The
equivalent Fourier-space definition to Eq. (10) is $\mathbf{v}(\mathbf{k}) = i \mathcal{H}\, (\mathbf{k}/k^2)\, \delta_v(\mathbf{k})$,
so, writing $\mu_k = \hat{\mathbf{n}} \cdot \hat{\mathbf{k}} = k_\parallel / k$, Eq. (9) becomes

$$ P_s(\mathbf{k}) = P_\Delta(\mathbf{k}) + 2 \mu_k^2 P_{\Delta v}(\mathbf{k}) + \mu_k^4 P_v(\mathbf{k}) . \tag{11} $$

If the field we consider is a linearly biased tracer of the underlying matter distribution, such as a simplistic model of
galaxy number counts, we would have $\Delta = b\, \delta_m$, with $b$ the linear bias factor. Assuming no velocity bias
this gives the classic Kaiser result (see [1])

$$ P_{g,s}(\mathbf{k}) = b^2 \left( 1 + b^{-1} f \mu_k^2 \right)^2 P_\delta(\mathbf{k}) . \tag{12} $$
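
The step from Eq. (11) to Eq. (12) can be checked with a few lines of code. The sketch below evaluates both expressions at a single wavevector for a linearly biased tracer with no velocity bias; the function names and the numerical values of the bias, growth rate and matter power are made up for illustration.

```python
def linear_redshift_power(p_dd, p_dv, p_vv, mu_k):
    """Eq. (11): P_s = P_Delta + 2 mu^2 P_{Delta v} + mu^4 P_v at one wavevector."""
    return p_dd + 2.0 * mu_k**2 * p_dv + mu_k**4 * p_vv


def kaiser_power(p_matter, b, f, mu_k):
    """Eq. (12): b^2 (1 + f mu^2 / b)^2 P_matter."""
    return b**2 * (1.0 + f * mu_k**2 / b) ** 2 * p_matter


# Illustrative (made-up) inputs: Delta = b * delta_m and delta_v = f * delta_m.
p_matter, b, f = 1.0, 2.0, 1.0
for mu_k in (0.0, 0.5, 1.0):
    general = linear_redshift_power(b**2 * p_matter, b * f * p_matter, f**2 * p_matter, mu_k)
    kaiser = kaiser_power(p_matter, b, f, mu_k)
    print(mu_k, general, kaiser)
```

Modes perpendicular to the line of sight ($\mu_k = 0$) are unaffected, while modes along the line of sight receive the full enhancement.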



II. NON-LINEAR POWER SPECTRUM 



The contributions to the redshift-space power spectrum beyond linear theory could be calculated by a perturbative 
expansion. We discuss perturbative relationships between real and redshift space in Appendices B and C. However,
as we might expect, this approach becomes tedious above second order, and features independently large terms that
nearly exactly cancel. The reason for this behaviour is that the effective displacement caused by a velocity becomes 
larger than the perturbation wavelength on small scales, so the small scale contribution to A(s) is very different from 
A(x). However most of this displacement comes from the coherent large-scale velocity field, which has little effect on 
the difference of the velocities that is important for an observable change in the correlation function. A bulk radial 
displacement is not observable in the flat-sky approximation. For this reason an approach based on transforming the 
real-space correlation functions may be significantly better. This is the approach we adopt here, which allows us to 
calculate a simple non-perturbative result for the redshift-space power spectrum. 

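We would like to find how a Gaussian density field $\Delta(\mathbf{x})$ appears in redshift space. Our starting point
is the conservation of field mass in a small volume element, in both real space and redshift space,

$$ [1 + \Delta_s(\mathbf{s})]\, d^3s = [1 + \Delta(\mathbf{x})]\, d^3x . \tag{13} $$

We emphasize that this is for a density field such as source counts, e.g. the galactic number density. Radiative fields
such as the brightness and brightness temperature are different since the measurement is then of observed photon
counts, rather than source number counts; we address this in Section III. With this restriction in mind we multiply
both sides by $e^{-i\mathbf{k}\cdot\mathbf{s}}$ and integrate, finding that

$$ \int [1 + \Delta_s(\mathbf{s})]\, e^{-i\mathbf{k}\cdot\mathbf{s}}\, d^3s = \int [1 + \Delta(\mathbf{x})]\, e^{-i\mathbf{k}\cdot\mathbf{s}}\, d^3x , \tag{14} $$

and substituting $\mathbf{s} = \mathbf{x} + \hat{\mathbf{n}}\, \phi(\mathbf{x})$ we then have

$$ (2\pi)^3 \delta^3(\mathbf{k}) + \Delta_s(\mathbf{k}) = \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, [1 + \Delta(\mathbf{x})]\, e^{-i k_\parallel \phi(\mathbf{x})} , \tag{15} $$

where $\delta^3(\mathbf{k})$ is the Dirac delta-function that we can neglect provided we limit ourselves to the
behaviour at $\mathbf{k} \neq 0$. To calculate the power spectrum we use

$$ \langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle = \iint d^3x\, d^3y\, e^{-i(\mathbf{k}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y})} \left\langle [1 + \Delta(\mathbf{x})]\, [1 + \Delta(\mathbf{y})]\, e^{-i[k_\parallel \phi(\mathbf{x}) + q_\parallel \phi(\mathbf{y})]} \right\rangle , \tag{16} $$

where $q_\parallel = \mathbf{q} \cdot \hat{\mathbf{n}}_y$, and $\hat{\mathbf{n}}_y = \mathbf{y}/y$. To calculate the
expectation value we assume that all the fields are Gaussian. Writing the fields as a vector
$\mathbf{z}^T = (\Delta(\mathbf{x}), \Delta(\mathbf{y}), \phi(\mathbf{x}), \phi(\mathbf{y}))$, and defining a further
vector $\mathbf{w}^T = -i\, (0, 0, k_\parallel, q_\parallel)$, we calculate the expectation values
$\langle e^{\mathbf{w}^T\mathbf{z}} \rangle$, $\langle \mathbf{z}\, e^{\mathbf{w}^T\mathbf{z}} \rangle$ and
$\langle \mathbf{z}\mathbf{z}^T e^{\mathbf{w}^T\mathbf{z}} \rangle$, defined by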



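$$ \langle (\ldots)\, e^{\mathbf{w}^T\mathbf{z}} \rangle = \frac{1}{(2\pi)^2 \det^{1/2}\mathbf{C}} \int d^4z\, \exp\left( -\tfrac{1}{2} \mathbf{z}^T \mathbf{C}^{-1} \mathbf{z} + \mathbf{w}^T\mathbf{z} \right) (\ldots) , \tag{17} $$

where $\mathbf{C}$ is the covariance matrix of the fields, $\mathbf{C} = \langle \mathbf{z}\mathbf{z}^T \rangle$. We
complete the square in the Gaussian integral to evaluate it, giving

$$ \langle e^{\mathbf{w}^T\mathbf{z}} \rangle = e^{\frac{1}{2} \mathbf{w}^T \mathbf{C} \mathbf{w}} . \tag{18a} $$

To calculate the remaining two expectation values we take the partial derivatives with respect to $\mathbf{w}$:

$$ \langle \mathbf{z}\, e^{\mathbf{w}^T\mathbf{z}} \rangle = e^{\frac{1}{2} \mathbf{w}^T \mathbf{C} \mathbf{w}}\, \mathbf{C}\mathbf{w} , \tag{18b} $$

$$ \langle \mathbf{z}\mathbf{z}^T e^{\mathbf{w}^T\mathbf{z}} \rangle = e^{\frac{1}{2} \mathbf{w}^T \mathbf{C} \mathbf{w}} \left[ \mathbf{C} + \mathbf{C}\mathbf{w}\mathbf{w}^T\mathbf{C} \right] . \tag{18c} $$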
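
The Gaussian identity of Eq. (18a) is easy to test by Monte Carlo. The sketch below is a reduced two-field version, keeping only two unit-variance fields standing in for $\phi(\mathbf{x})$ and $\phi(\mathbf{y})$ with an assumed correlation coefficient `rho`; the function names, sample size, seed and the values of `k`, `q` and `rho` are illustrative assumptions.

```python
import cmath
import math
import random


def mc_expectation(k, q, rho, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <exp(-i [k z1 + q z2])> for unit-variance
    Gaussians z1, z2 with correlation coefficient rho."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    total = 0j
    for _ in range(n_samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        z1 = g1
        z2 = rho * g1 + s * g2          # correlated with z1 by construction
        total += cmath.exp(-1j * (k * z1 + q * z2))
    return total / n_samples


def analytic_expectation(k, q, rho):
    """Eq. (18a) with w = -i (k, q): exp(w^T C w / 2), where
    w^T C w = -(k^2 + q^2 + 2 k q rho) for unit variances."""
    return math.exp(-0.5 * (k * k + q * q + 2.0 * k * q * rho))


estimate = mc_expectation(0.7, -0.4, 0.5)
exact = analytic_expectation(0.7, -0.4, 0.5)
print(estimate, exact)
```

The sampled average should agree with the closed form to within the Monte Carlo noise, with an imaginary part consistent with zero.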



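The results of Eq. (18) allow us to evaluate Eq. (16): we take (18a), the 1 and 2 components of (18b), corresponding
to $\Delta(\mathbf{x})$ and $\Delta(\mathbf{y})$, and the 1,2 component of (18c), from
$\Delta(\mathbf{x})\Delta(\mathbf{y})$, and sum them to construct the required expectation value. The required
components are

$$ \mathbf{w}^T\mathbf{C}\mathbf{w} = -k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) - q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y}) - 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y}) , \tag{19a} $$

$$ [\mathbf{C}\mathbf{w}]_1 + [\mathbf{C}\mathbf{w}]_2 = -i \left[ q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y}) + k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \right] , \tag{19b} $$

$$ [\mathbf{C} + \mathbf{C}\mathbf{w}\mathbf{w}^T\mathbf{C}]_{12} = C_\Delta(\mathbf{x},\mathbf{y}) - k_\parallel q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) , \tag{19c} $$

where we have defined $C_{ab}(\mathbf{x},\mathbf{y}) = \langle a(\mathbf{x})\, b(\mathbf{y}) \rangle$. Note that
statistical isotropy of the underlying correlation requires $\langle \Delta(\mathbf{x})\, \mathbf{v}(\mathbf{x}) \rangle = 0$,
and hence the definition of the $\phi$ field means that $C_{\Delta\phi}(\mathbf{x},\mathbf{x}) = 0$. Combining the
above, the expectation value $\langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle$ evaluates to

$$ \langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle = \iint d^3x\, d^3y\, e^{-i[\mathbf{k}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y}]}\, e^{-\frac{1}{2}\left[ k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) + q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y}) + 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y}) \right]} $$
$$ \qquad \times \left[ 1 + C_\Delta(\mathbf{x},\mathbf{y}) - i q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - i k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x}) - k_\parallel q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \right] . \tag{20} $$

This result can now be used to calculate the flat-sky power spectrum $P(\mathbf{k})$ and the directly observable
angular power spectrum $C_l(z, z')$, as we show in the following sections.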
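
As a consistency check (a sketch, not part of the original derivation), Eq. (20) can be expanded to first order in the correlation functions. Terms that depend on only one of $\mathbf{x}$ or $\mathbf{y}$ contribute only at $\mathbf{k} = 0$ or $\mathbf{q} = 0$ and are dropped, as in the text. What remains is the two-point function of the linear redshift-space perturbation of Eq. (8):

```latex
% First-order expansion of Eq. (20); k = 0 and q = 0 contributions are neglected.
\begin{align}
\langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle
 &\approx \iint d^3x\, d^3y\; e^{-i[\mathbf{k}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y}]}
   \Big[ C_\Delta(\mathbf{x},\mathbf{y})
        - i q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y})
        - i k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x})
        - k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y}) \Big] \\
 &= \iint d^3x\, d^3y\; e^{-i[\mathbf{k}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y}]}
   \Big\langle \big[ \Delta(\mathbf{x}) - i k_\parallel \phi(\mathbf{x}) \big]
               \big[ \Delta(\mathbf{y}) - i q_\parallel \phi(\mathbf{y}) \big] \Big\rangle .
\end{align}
```

The $-k_\parallel q_\parallel C_\phi$ term comes from expanding the exponential prefactor, while the remaining terms come directly from the square bracket.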

It is possible to extend this method to the calculation of higher $n$-point functions, such as the bispectrum and
higher moments, allowing investigation of the non-Gaussianity introduced solely by the redshift-space distortions.
This is conceptually simple: we take further moments of Eq. (15), giving

$$ \langle \Delta_s(\mathbf{k}_1)\, \Delta_s(\mathbf{k}_2) \cdots \Delta_s(\mathbf{k}_n) \rangle = \int \prod_i d^3x_i\, e^{-i\mathbf{k}_i\cdot\mathbf{x}_i} \left\langle \prod_j [1 + \Delta(\mathbf{x}_j)]\, e^{-i k_{j\parallel} \phi(\mathbf{x}_j)} \right\rangle , \tag{21} $$

where we have continued to neglect the behaviour at $\mathbf{k} = 0$. This can be evaluated in the same manner as
above, though that is beyond the scope of this paper; we will limit ourselves to the power spectrum.



III. RADIATIVE FIELDS AND THE DISTANT OBSERVER APPROXIMATION 

Both the matter density field and the galactic number density are examples of spatial densities, where the conserved
quantity we consider in the transformation between real and redshift space is the mass in a small volume element

$$ \rho_s(\mathbf{s})\, d^3s = \rho(\mathbf{x})\, d^3x . \tag{22} $$

This was the line we proceeded along in the previous section. However for radiative quantities such as the brightness
we have a subtly different result: if we radially displace a number of sources we still observe the same number, however
the brightness is less because we receive fewer photons from a source that is more distant. For a detector of area $dA$,
receiving frequencies in a range $d\nu$ about $\nu$ from a source region of solid angle $d\Omega$, the brightness
$I_\nu$ is defined by the energy received $dE$ in a short time $dt$

$$ dE = I_\nu\, dA\, d\Omega\, d\nu\, dt , \tag{23} $$

or simply, the brightness $I_\nu$ is the flux onto a detector at a frequency $\nu$ from a source per unit solid angle
per unit frequency. For radiative fields the fundamental observed quantity is $I_\nu\, d\Omega\, d\nu$, the flux in a
frequency range $\nu$ to $\nu + d\nu$, from a solid angle $d\Omega$. The redshift is determined by the shift from the
source frequency $\nu_0$, and thus the frequency interval $d\nu$ gives the radial distance interval in real or redshift
space. The conservation equation, neglecting factors of $H$, is then

$$ I_\nu(\mathbf{s})\, d\Omega\, ds = I_\nu(\mathbf{x})\, d\Omega\, dx , \tag{24} $$

where the subtle distinction between $I_\nu(\mathbf{s})$ and $I_\nu(\mathbf{x})$ is that in the latter we remove the
distortion of the frequency interval $d\nu$ caused by the peculiar motion. In the Rayleigh-Jeans approximation
(excellent for typical 21cm line observation) the brightness temperature is $T_b(\nu) = I_\nu c^2 / 2 k_B \nu^2$, so
this result also holds for the brightness temperature. Using $s = x + \phi(\mathbf{x})$ this implies that

$$ \Delta_{s,T_b}(\mathbf{s}) = \frac{\Delta_{T_b}(\mathbf{x}) - \phi'(\mathbf{x})}{1 + \phi'(\mathbf{x})} , \tag{25} $$

which was only an approximation in the case of number counts, Eq. (6). We discuss the perturbative expansion of
this result in Appendix B. To follow the number count derivation we must take the Fourier transform, and so convert
the small parameter-space region $d\Omega\, ds$ into the small volume $d^3s = s^2\, d\Omega\, ds$ (similarly for real
space), and hence write Eq. (24) as

$$ [1 + \Delta_{s,T_b}(\mathbf{s})]\, d^3s = [1 + \Delta_{T_b}(\mathbf{x})] \left( 1 + \frac{\phi}{x} \right)^2 d^3x . \tag{26} $$

If we simply follow through the analysis of Section II we come unstuck because of the $1 + \phi/x$ term, which would
make the analysis significantly more complicated (though not intractable). The simplifying solution is to apply an
approximation that is not required in the spatial density case, the distant observer approximation. Given that we are
observing at large distances relative to the velocity displacement $\phi$, and that the distortions are sourced largely
by the gradients of the velocity field, we assert that for all scales of interest $\phi(\mathbf{x})/x \ll \phi'(\mathbf{x})$
and set $1 + \phi/x \approx 1$. At high redshift ($z > 5$) we find $\phi_{\rm rms}/x$ to be at most of order $10^{-3}$
whilst $\phi'_{\rm rms}$ is consistently of order 1, so we expect this to be a reasonable approximation. After this it
is possible to apply all the previous analysis to radiative fields such as $\Delta_{T_b}$ as well as density fields.

Applying the distant observer approximation not only allows us to consider radiative fields, but allows a
simplification of the preceding analysis in all cases. Starting from Eq. (14) we transform the $\delta$-function
generating term on the LHS by substituting in explicitly for $\mathbf{x}$ and writing it as

$$ \int e^{-i\mathbf{k}\cdot\mathbf{s}}\, d^3s = \int e^{-i[\mathbf{k}\cdot\mathbf{x} + k_\parallel \phi(\mathbf{x})]} \left( 1 + \frac{\phi(\mathbf{x})}{x} \right)^2 \left( 1 + \phi'(\mathbf{x}) \right) d^3x . \tag{27} $$

Invoking the distant observer approximation removes the $\phi/x$ term, and cancelling the lowest order terms on both
sides leaves us with

$$ \Delta_s(\mathbf{k}) = \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, [\Delta(\mathbf{x}) - \phi'(\mathbf{x})]\, e^{-i k_\parallel \phi(\mathbf{x})} . \tag{28} $$

Note that this holds for all $\mathbf{k}$ including $\mathbf{k} = 0$, unlike the previous formulation. For number
counts this approximation neglects first-order $\phi/x$ terms, but for radiative fields it is actually correct to
first order, neglecting only terms $O(\Delta\phi/x)$ and higher. Conceptually this is because if you radially displace
a volume at redshift $z$ in an angle $d\Omega$ the physical volume corresponding to that $d\Omega$ increases
$\propto r^2$, giving a linear $O(\phi/x)$ change to the number of sources (cf. Ref. 0]). However by the
inverse-square law the fraction of photons received from each source goes down by $1/r^2$, so the number of photons
received is invariant at first order. To ensure that the result for number counts contains all the effects at first
order we simply preserve the linear $\phi/x$ term in Eq. (27).

Comparison with Eq. (25) shows that the quantity $\Delta(\mathbf{x}) - \phi'(\mathbf{x})$ is the source of redshift
distortions at first order. Writing the redshift-space perturbation in this form makes it clear where the contributions
are coming from, and more obvious how it reduces to the first-order result. Given its importance we will denote the
first-order source as $a(\mathbf{x}) = \Delta(\mathbf{x}) - \phi'(\mathbf{x})$ from now on. To progress towards the
power spectrum we follow the same lines as Eq. (15) to Eq. (20), with the only change that we average over
$\mathbf{z}^T = (a(\mathbf{x}), a(\mathbf{y}), \phi(\mathbf{x}), \phi(\mathbf{y}))$ to calculate the expectation.
Finally we have $\langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle$ in the distant observer approximation

$$ \langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle = \iint d^3x\, d^3y\, e^{-i(\mathbf{k}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y})} \exp\left( -\tfrac{1}{2} \left[ k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) + q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y}) + 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y}) \right] \right) $$
$$ \qquad \times \left[ C_a(\mathbf{x},\mathbf{y}) - k_\parallel q_\parallel C_{a\phi}(\mathbf{x},\mathbf{y})\, C_{a\phi}(\mathbf{y},\mathbf{x}) \right] . \tag{29} $$

To keep this correct for spatial densities at first order we must use
$a(\mathbf{x}) = \Delta(\mathbf{x}) - \phi'(\mathbf{x}) - 2\phi(\mathbf{x})/x$, giving the linear result without having
assumed the distant observer approximation.
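
The absence of terms linear in $k_\parallel$ or $q_\parallel$ in Eq. (29) can be understood with the following sketch, which is our reading of the steps rather than a derivation quoted from the paper. It assumes the flat-sky limit, in which $\langle \phi^2 \rangle$ is independent of position. Because Eq. (28) has no "$1 +$" in its integrand, only the (18c)-type term contributes:

```latex
% Sketch: with w^T = -i (0, 0, k_\parallel, q_\parallel) and z^T = (a(x), a(y), \phi(x), \phi(y)).
\begin{align}
[\mathbf{C}\mathbf{w}]_1 &= -i \left[ k_\parallel C_{a\phi}(\mathbf{x},\mathbf{x}) + q_\parallel C_{a\phi}(\mathbf{x},\mathbf{y}) \right] , \qquad
[\mathbf{C}\mathbf{w}]_2 = -i \left[ k_\parallel C_{a\phi}(\mathbf{y},\mathbf{x}) + q_\parallel C_{a\phi}(\mathbf{y},\mathbf{y}) \right] , \\
C_{a\phi}(\mathbf{x},\mathbf{x}) &= \langle \Delta\, \phi \rangle - \langle \phi' \phi \rangle
  = 0 - \tfrac{1}{2}\, \partial_\parallel \langle \phi^2 \rangle = 0 , \\
[\mathbf{C} + \mathbf{C}\mathbf{w}\mathbf{w}^T\mathbf{C}]_{12}
  &= C_a(\mathbf{x},\mathbf{y}) - k_\parallel q_\parallel\, C_{a\phi}(\mathbf{x},\mathbf{y})\, C_{a\phi}(\mathbf{y},\mathbf{x}) .
\end{align}
```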

IV. POWER SPECTRA ON THE FLAT-SKY 



We first consider the flat-sky approximation, appropriate for a small patch of sky sufficiently thin in redshift that
evolution along the light cone can be neglected. The patch is assumed to be at a large distance and subtending a small
angle so that $\hat{\mathbf{n}} \approx \hat{\mathbf{n}}'$ across the patch. Since we are neglecting evolution, in a
statistically homogeneous universe with isotropy broken locally only by the line-of-sight direction the correlation
functions should be a function of $r = |\mathbf{x} - \mathbf{y}|$ and $\mu_r = \hat{\mathbf{n}} \cdot \hat{\mathbf{r}}$
only, so



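$$ C_\Delta(\mathbf{x},\mathbf{y}) = \xi_\Delta(r) , \tag{30a} $$
$$ C_{\Delta\phi}(\mathbf{x},\mathbf{y}) = \xi_{\Delta\phi}(r, \mu_r) , \tag{30b} $$
$$ C_\phi(\mathbf{x},\mathbf{y}) = \xi_\phi(r, \mu_r) . \tag{30c} $$

Changing one integration variable from $\mathbf{x}$ to $\mathbf{r}$ in Eq. (20), we can then perform the integration
over $\mathbf{y}$ using

$$ \int d^3y\, e^{-i\mathbf{y}\cdot(\mathbf{k} + \mathbf{q})} = (2\pi)^3 \delta^3(\mathbf{k} + \mathbf{q}) . \tag{31} $$

By definition the power spectrum is

$$ \langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle = (2\pi)^3 \delta^3(\mathbf{k} + \mathbf{q})\, P_s(\mathbf{k}) , \tag{32} $$

and hence we identify $P_s(\mathbf{k})$ as

$$ P_s(\mathbf{k}) = \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, e^{-k_\parallel^2 [\xi_\phi(0) - \xi_\phi(r, \mu_r)]} \left[ 1 + \xi_\Delta(r) + 2 i k_\parallel\, \xi_{\Delta\phi}(r, \mu_r) - k_\parallel^2\, \xi_{\Delta\phi}(r, \mu_r)^2 \right] . \tag{33} $$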
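
The step from Eq. (20) to Eq. (33) can be made explicit as follows. This is a sketch under two assumptions not spelled out above: we take $\mathbf{r} = \mathbf{x} - \mathbf{y}$, and we use the parity of the correlation functions, with $\xi_\phi$ even and $\xi_{\Delta\phi}$ odd under $\mathbf{r} \to -\mathbf{r}$ because $\phi$ carries a single line-of-sight velocity component. The delta function of Eq. (31) sets $\mathbf{q} = -\mathbf{k}$, so that:

```latex
% Sketch of the substitution q = -k in Eq. (20), with r = x - y.
\begin{align}
k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) + q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y})
  + 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y})
  &\;\to\; 2 k_\parallel^2 \left[ \xi_\phi(0) - \xi_\phi(r, \mu_r) \right] , \\
- i q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - i k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x})
  &\;\to\; 2 i k_\parallel\, \xi_{\Delta\phi}(r, \mu_r) , \\
- k_\parallel q_\parallel\, C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x})
  &\;\to\; - k_\parallel^2\, \xi_{\Delta\phi}(r, \mu_r)^2 .
\end{align}
```

The factor of $\tfrac{1}{2}$ in the exponent of Eq. (20) then gives the damping term $e^{-k_\parallel^2 [\xi_\phi(0) - \xi_\phi(r, \mu_r)]}$ of Eq. (33).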



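The correlation functions above are dependent only on the angle between $\mathbf{r}$ and $\hat{\mathbf{n}}$, and hence
are azimuthally symmetric, allowing us to integrate out this dependence. If we separate the exponential term as
$e^{-i\mathbf{k}\cdot\mathbf{r}} = e^{-i k_\parallel r_\parallel}\, e^{-i k_\perp r_\perp \cos\varphi}$ we can integrate
over $\varphi$, and use the identity

$$ \frac{1}{2\pi} \int_0^{2\pi} \exp(-i x \cos\varphi)\, d\varphi = J_0(x) , \tag{34} $$

where $J_0(x)$ is the zeroth-order Bessel function of the first kind. Furthermore, knowing that the result will be
real, we can split the remaining exponential into its cosine and sine parts and keep only the real combinations.
Combining these we have

$$ P_s(\mathbf{k}) = 4\pi \int_0^\infty dr \int_0^1 d\mu_r\, r^2 J_0\!\left( k_\perp r \sqrt{1 - \mu_r^2} \right) e^{-k_\parallel^2 [\xi_\phi(0) - \xi_\phi(r, \mu_r)]} $$
$$ \qquad \times \left\{ \cos(k_\parallel r \mu_r) \left[ 1 + \xi_\Delta(r) - k_\parallel^2\, \xi_{\Delta\phi}(r, \mu_r)^2 \right] + 2 k_\parallel \sin(k_\parallel r \mu_r)\, \xi_{\Delta\phi}(r, \mu_r) \right\} . \tag{35} $$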



This is the final form, suitable for numerical evaluation. Unfortunately the integral is highly oscillatory, but we must
still include the structure in the integrand across a large range, $k\,\mathrm{Mpc} \sim 10^{-3}$ to $10^{3}$. This
includes a very large number of oscillations and thus requires careful evaluation. Calculation of the correlation
functions $\xi_\Delta$, $\xi_{\Delta\phi}$ and $\xi_\phi$ from the relevant power spectra is considered in Appendix A.
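
The azimuthal identity of Eq. (34), which produces the Bessel function in Eq. (35), can be verified numerically. The sketch below uses only the standard library; the helper names are illustrative, the Bessel function is evaluated from its power series (adequate for moderate arguments only), and the azimuthal average uses a uniform sum, which converges very quickly for periodic integrands.

```python
import cmath
import math


def j0_series(x, terms=40):
    """Power series J_0(x) = sum_m (-1)^m (x/2)^(2m) / (m!)^2, for moderate |x|."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def j0_azimuthal(x, n=256):
    """(1 / 2 pi) * int_0^{2 pi} exp(-i x cos(phi)) dphi as a uniform periodic sum."""
    total = 0j
    for j in range(n):
        total += cmath.exp(-1j * x * math.cos(2.0 * math.pi * j / n))
    return total / n


for x in (0.5, 2.0, 10.0):
    print(x, j0_azimuthal(x), j0_series(x))
```

The imaginary part of the azimuthal average vanishes, consistent with the statement that the final power spectrum is real.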
A result equivalent to Eq. (35) has been derived previously in Ref. Q , but numerical calculation was not attempted
because the focus was on low redshifts where other non-linear effects are very important. Here we calculate the effects
at high redshift, discuss the physical origin of the various effects, and in Section V also generalize to the directly
observable angular power spectrum. We also note from Section III that Eq. (35) can alternatively be written in terms
of the correlation functions of the first-order source $a(\mathbf{x})$ as

$$ P_s(\mathbf{k}) = 4\pi \int_0^\infty dr \int_0^1 d\mu_r\, r^2 J_0\!\left( k_\perp r \sqrt{1 - \mu_r^2} \right) \cos(k_\parallel r \mu_r) \left[ \xi_a(r, \mu_r) - k_\parallel^2\, \xi_{a\phi}(r, \mu_r)^2 \right] e^{-k_\parallel^2 [\xi_\phi(0) - \xi_\phi(r, \mu_r)]} . \tag{36} $$



In Figure 1 we compare the non-linear results at redshift 10 for two distinct values of $\mu_k$. There are two distinct
effects taking place here: firstly, at low $\mu_k$ there is a suppression of power across all scales; secondly, at high
$\mu_k$ there is an increase in power which overcomes the general suppression at large values of $k$. The effect
reaches the 1% level at around $k = 0.3\,h\,\mathrm{Mpc}^{-1}$. If we calculate the rms perturbation in spheres of half
of this wavelength, $\pi/k \approx 10.5\,h^{-1}\,\mathrm{Mpc}$, we find $\sigma_{10.5} \approx 0.077$. Thus at this
scale perturbations are still firmly linear, and this effect should be significant relative to any non-linear evolution.

We can gain some insight into the physical origin of these effects by considering the leading order perturbative
corrections, that is, those second order in the power spectra. We make use of some of the results from Appendix C,
where we examine the perturbative expansion and second-order asymptotics.

The general suppression can be understood from the form of the perturbative result at large scales. Taking the
result from Eq. (C14), we find that on large scales the non-linear contribution,
$\Delta P_s(\mathbf{k}) = P_s(\mathbf{k}) - P_s^{\rm lin}(\mathbf{k})$, for fully correlated fields is

$$ \Delta P_s(\mathbf{k}) \sim -k_\parallel^2\, \xi_\phi(0)\, P_s^{\rm lin}(\mathbf{k}) . \tag{37} $$

To gain insight into this note that $\xi_\phi(0)$ is the point line-of-sight velocity variance in Hubble units, which
serves to wash out a large-scale mode with wavenumber $k$ in the line-of-sight direction by a fraction
$O(k_\parallel \xi_\phi(0)^{1/2})$ of a wavelength. This leads to a suppression of large-scale power.

The expansion of the perturbative result for large $k$ suggests a source of the small-scale boost in power: the
superposition of large-scale modes on top of modes at that $k$. The contributions in Eq. (C11) are complicated, though
schematically they are of the form

$$ \Delta P_s(\mathbf{k}) \sim P_{\phi'}(\mathbf{k})\, \xi_a(0) + P_{a\phi'}(\mathbf{k})\, \xi_{a\phi'}(0) + P_a(\mathbf{k})\, \xi_{\phi'}(0) , \tag{38} $$

where we have neglected constants and angular dependence, and have approximated
$k\, dP_a(k)/dk \sim \mathrm{const} \times P_a(k)$, which is good for large $k$ in the tail of the spectrum. All terms
are of the form of a power spectrum at some $k$ times the point variance of another quantity from larger scales. The
first term (which is essentially exact) represents the superposition of velocity gradients on the point redshift-space
power coming from larger scales. The other terms are similar, but contain complicated angular behaviour which we have
omitted.

At lower redshift, when terms above second order become important, the exponent term in Eq. (35) becomes large
unless $r \sim 0$. This leads to an exponential suppression of the coupling from larger scales, reflecting the fact that
once small-scale velocities effectively wipe out the power by line-of-sight smearing, this wins over the boost due to
superimposing larger-scale modes. The calculation is of course not reliable in this regime due to significant
non-Gaussianity and non-linear evolution; nonetheless the qualitative effect is well known as the Fingers of God, when
non-linear clusters contribute significant small-scale velocities Q. An extra uncorrelated Gaussian point velocity
variance $\sigma^2$ can easily be included in our model by making the substitution
$\xi_\phi(0) \to \xi_\phi(0) + \sigma^2 / 3\mathcal{H}^2$. This has the effect that
$P_s(\mathbf{k}) \to e^{-k_\parallel^2 \sigma^2 / 3\mathcal{H}^2}\, P_s(\mathbf{k})$, so that power on scales smaller
than the redshift-space spread is exponentially suppressed. This describes the effect of finite line width due to the
local thermal motion when considering diffuse 21cm, and also roughly the effect of non-linear virial motion within
clusters when measuring number count power spectra on much larger scales. For further discussion of an approximate
effective model at low redshift when non-linear evolution is important see Ref. 0, Q .
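
To get a feel for the damping factor $e^{-k_\parallel^2 \sigma^2 / 3\mathcal{H}^2}$, the sketch below evaluates it for the cosmology quoted in the introduction. It is illustrative only: radiation is neglected, $\sigma$ is taken in km/s and $k_\parallel$ in comoving $\mathrm{Mpc}^{-1}$, and the function names and the example dispersion of 30 km/s are assumptions rather than values from the paper.

```python
import math


def comoving_hubble(z, h0=73.0, omega_m=0.126 / 0.73**2):
    """calH = H(z) / (1 + z) in km/s/Mpc for flat LCDM, radiation neglected (assumption)."""
    hz = h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return hz / (1.0 + z)


def velocity_damping(k_par, sigma, z):
    """exp(-k_par^2 sigma^2 / (3 calH^2)); k_par in 1/Mpc, sigma in km/s."""
    return math.exp(-((k_par * sigma) ** 2) / (3.0 * comoving_hubble(z) ** 2))


def efold_wavenumber(sigma, z):
    """Line-of-sight wavenumber (1/Mpc) at which the damping factor equals 1/e."""
    return math.sqrt(3.0) * comoving_hubble(z) / sigma


sigma, z = 30.0, 10.0                 # illustrative dispersion in km/s and redshift
k_e = efold_wavenumber(sigma, z)
print(f"calH = {comoving_hubble(z):.1f} km/s/Mpc, 1/e damping at k_par = {k_e:.2f} / Mpc")
for k_par in (0.1, 1.0, 10.0):
    print(k_par, velocity_damping(k_par, sigma, z))
```

Line-of-sight modes with wavenumbers well below the returned value are essentially untouched, while those above it are strongly suppressed.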

In Figure 1 we also plot the contributions to the power spectra with the terms at first and second order in the
linear spectrum subtracted off, showing the contributions missed by second-order perturbation theory. In the
$\mu_k = 1$ case these contributions are significant at higher $k$, being greater than 5% above
$k = 10\,h\,\mathrm{Mpc}^{-1}$; for accurate calculations of the redshift-space power spectrum on small scales a fully
non-linear calculation is essential.

In Figure 2 the size of the non-linear contributions from redshift distortions at redshifts of $z = 10$ and $z = 30$ is
compared for dark matter and 21cm brightness temperature perturbations. On small scales the boosting of power
means that non-linear effects are increasingly important in comparison to the linear prediction. At a redshift of
$z = 10$ their dominance at reasonable scales means that they are potentially observationally relevant. This is still
true for the 21cm spectra, and we discuss the consequences of this in Section VI.

FIG. 1: The full dark matter power spectrum, and the non-linear contributions at redshift $z = 10$. We plot two values
of $\mu_k$, a small value $\mu_k = 0.2$ in the upper plot and the completely parallel case $\mu_k = 1.0$ in the lower
plot. The solid lines are for positive values, the dotted lines are negative. Whilst the non-linear contributions are
negative for $\mu_k = 0.2$, the contributions at higher $\mu_k$ actually boost the power on small scales. We also plot
the non-linear contributions greater than second order in the power spectrum; this shows that second-order
perturbation theory is largely inadequate at high $k$ and high $\mu_k$, at about 5% at $k = 10\,h\,\mathrm{Mpc}^{-1}$.

FIG. 2: The ratio of the non-linear contributions to the linear predictions for the dark matter redshift-space power
spectrum at redshifts of $z = 10$ and 30, and the 21cm brightness temperature power spectrum, all for $\mu_k = 1.0$. In
all cases the non-linear contributions become significant at high $k$, whilst in the low redshift dark matter case they
become dominant for $k$ approximately greater than $10\,h\,\mathrm{Mpc}^{-1}$. This also shows the 21cm non-linear
corrections are of the same magnitude as the dark matter corrections.

In Figure 3 the size of the non-linear contributions from redshift distortions is compared to that from non-linear
growth (calculated using 3rd-order perturbation theory 0, Q). The contributions are of equivalent importance at all
scales.



V. ANGULAR CORRELATIONS ON THE CURVED SKY 

The redshift space power spectrum that we calculated in the previous section, like the first order result, contains an 
explicit anisotropy within the small observed volume due to the direction defined by the line of sight. Whilst useful 
for consideration of localized distortions in redshift space, we should remember that each observer in the universe 
should see a statistically isotropic light cone if the universe is statistically isotropic and homogeneous. It is the angular 
correlation between different redshifts on the light cone that is directly observable. The most natural descriptions for 
the whole sky should take this directly into account, separating out the radial distances and displacements. In this 
section we calculate the angular correlation function $\xi_s(x, y, \mu)$, which correlates observations at points at
redshifts $z$ and $z'$ separated by angle $\cos^{-1}\mu$; and the angular power spectrum $C_l(z, z')$, giving the
correlation of multipoles $l$ at different redshifts $z$ and $z'$.

Our starting point is to calculate the correlation function between positions $\mathbf{z}$ and $\mathbf{z}'$ in
redshift space. This is achieved by taking the inverse transform of Eq. (20), yielding

$$ \langle \Delta_s(\mathbf{z})\, \Delta_s(\mathbf{z}') \rangle = \iint \frac{d^3k\, d^3q}{(2\pi)^6}\, e^{i[\mathbf{k}\cdot\mathbf{z} + \mathbf{q}\cdot\mathbf{z}']}\, \langle \Delta_s(\mathbf{k})\, \Delta_s(\mathbf{q}) \rangle . \tag{39} $$

This is effectively the forward and inverse transform of our starting point (usually a redundant process); we have
required it to eliminate the unwanted $k_\parallel$ terms. Substituting Eq. (20) into the above (with the
delta-function that was suppressed from Eq. (15)) we have:

$$ \langle \Delta_s(\mathbf{z})\, \Delta_s(\mathbf{z}') \rangle = \int \frac{d^3k\, d^3q\, d^3x\, d^3y}{(2\pi)^6}\, e^{i[\mathbf{k}\cdot(\mathbf{z} - \mathbf{x}) + \mathbf{q}\cdot(\mathbf{z}' - \mathbf{y})]}\, e^{-\frac{1}{2}\left[ k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) + q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y}) + 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y}) \right]} $$
$$ \qquad \times \left[ 1 + C_\Delta(\mathbf{x},\mathbf{y}) - i q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - i k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x}) - k_\parallel q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \right] - 1 . \tag{40} $$

FIG. 3: The ratio of the corrections due to non-linear redshift-space distortions ($\mu_k = 1$) and non-linear
evolution contributions to the linear dark matter power spectrum at a redshift of $z = 10$. The redshift distortion
corrections are of roughly the same magnitude as those from non-linear growth at all scales, and become greater on
smaller scales; both effects should be thought of as equally important when considering modes not orthogonal to the
line of sight.



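The term in the large square brackets above is a function only of $k_\parallel = \mathbf{k}\cdot\hat{\mathbf{n}}_x$ and $q_\parallel = \mathbf{q}\cdot\hat{\mathbf{n}}_y$, and thus we can integrate
out the perpendicular components of $\mathbf{k}$ and $\mathbf{q}$ to give the delta functions $\delta^2(\mathbf{z}_\perp)$ and $\delta^2(\mathbf{z}'_\perp)$. These effectively constrain $\mathbf{x}$
and $\mathbf{y}$, forcing them to be parallel to $\mathbf{z}$ and $\mathbf{z}'$ respectively. Given that redshift distortions displace only along the
line of sight this is what we should expect. This leaves the integral:

$$\langle \Delta_s(\mathbf{z}) \Delta_s(\mathbf{z}') \rangle = \int \frac{dx\, dy\, dk_\parallel\, dq_\parallel}{(2\pi)^2}\, e^{i[k_\parallel (z - x) + q_\parallel (z' - y)]}\, e^{-\frac{1}{2}\left[k_\parallel^2 C_\phi(\mathbf{x},\mathbf{x}) + q_\parallel^2 C_\phi(\mathbf{y},\mathbf{y}) + 2 k_\parallel q_\parallel C_\phi(\mathbf{x},\mathbf{y})\right]}$$
$$\qquad \times \Big[ 1 + C_\Delta(\mathbf{x},\mathbf{y}) - i q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - i k_\parallel C_{\Delta\phi}(\mathbf{y},\mathbf{x}) - k_\parallel q_\parallel C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \Big] - 1 , \tag{41}$$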

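where now the vectors $\mathbf{x} = x\,\hat{\mathbf{n}}_z$ and $\mathbf{y} = y\,\hat{\mathbf{n}}_{z'}$. Conveniently this is now an integral of Gaussian form in the variables
$k_\parallel$ and $q_\parallel$ that we can evaluate analytically. Writing these as the vector $\mathbf{t}^T = (k_\parallel, q_\parallel)$, we recast the integral as

$$\langle \Delta_s(\mathbf{z}) \Delta_s(\mathbf{z}') \rangle = \int \frac{dx\, dy\, d^2t}{(2\pi)^2}\, \exp\left[-\frac{1}{2}\mathbf{t}^T \mathbf{A}_\phi \mathbf{t} - i \mathbf{u}^T \cdot \mathbf{t}\right]$$
$$\qquad \times \Big[ 1 + C_\Delta(\mathbf{x},\mathbf{y}) - i t_2 C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - i t_1 C_{\Delta\phi}(\mathbf{y},\mathbf{x}) - t_1 t_2\, C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \Big] - 1 , \tag{42}$$

where

$$\mathbf{u}^T = (x - z,\; y - z') , \tag{43a}$$
$$\mathbf{A}_\phi = \begin{pmatrix} C_\phi(\mathbf{x},\mathbf{x}) & C_\phi(\mathbf{x},\mathbf{y}) \\ C_\phi(\mathbf{x},\mathbf{y}) & C_\phi(\mathbf{y},\mathbf{y}) \end{pmatrix} . \tag{43b}$$

The prototype for this integral is

$$\int d^2t\, \exp\left[-\frac{1}{2}\mathbf{t}^T \mathbf{A}_\phi \mathbf{t} - i \mathbf{u}^T \cdot \mathbf{t}\right] = \frac{2\pi}{\det^{1/2}\mathbf{A}_\phi}\, \exp\left(-\frac{1}{2}\mathbf{u}^T \mathbf{A}_\phi^{-1} \mathbf{u}\right) . \tag{44}$$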
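The prototype integral (44) is easy to check numerically. The following is a minimal sketch, not part of the original analysis: the function names, the example matrix and the grid parameters are illustrative assumptions. It compares a brute-force midpoint-rule evaluation with the closed form; the imaginary part of the integrand is odd in $\mathbf{t}$ and so integrates to zero, leaving only the cosine.

```python
import math

def gaussian_prototype_numeric(A, u, half_width=12.0, n=400):
    """Midpoint-rule estimate of  int d^2t exp(-t^T A t / 2 - i u.t).

    A is a symmetric positive-definite 2x2 matrix given as nested tuples.
    Only the real (cosine) part is summed; the sine part is odd and vanishes.
    """
    (a11, a12), (_, a22) = A
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        t1 = -half_width + (i + 0.5) * h
        for j in range(n):
            t2 = -half_width + (j + 0.5) * h
            quad = a11 * t1 * t1 + 2.0 * a12 * t1 * t2 + a22 * t2 * t2
            total += math.exp(-0.5 * quad) * math.cos(u[0] * t1 + u[1] * t2)
    return total * h * h

def gaussian_prototype_closed(A, u):
    """Right-hand side of Eq. (44): 2 pi det(A)^(-1/2) exp(-u^T A^-1 u / 2)."""
    (a11, a12), (_, a22) = A
    det = a11 * a22 - a12 * a12
    # u^T A^{-1} u written out for a 2x2 symmetric matrix
    q = (a22 * u[0] ** 2 - 2.0 * a12 * u[0] * u[1] + a11 * u[1] ** 2) / det
    return 2.0 * math.pi / math.sqrt(det) * math.exp(-0.5 * q)
```

The matrix entries play the role of $C_\phi(\mathbf{x},\mathbf{x})$, $C_\phi(\mathbf{x},\mathbf{y})$ and $C_\phi(\mathbf{y},\mathbf{y})$; any positive-definite choice works for the check.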



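Further moments can be generated by taking derivatives with respect to the vector $\mathbf{u}$, as done to construct (18).
Putting this together, the correlation function is given by a two dimensional integral in the radial distances $x$ and $y$,

$$\langle \Delta_s(\mathbf{z}) \Delta_s(\mathbf{z}') \rangle = \int dx\, dy\, \frac{e^{-\frac{1}{2}\mathbf{u}^T \mathbf{A}_\phi^{-1} \mathbf{u}}}{2\pi |\mathbf{A}_\phi|^{1/2}} \Big[ 1 + C_\Delta(\mathbf{x},\mathbf{y}) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_2\, C_{\Delta\phi}(\mathbf{x},\mathbf{y}) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_1\, C_{\Delta\phi}(\mathbf{y},\mathbf{x})$$
$$\qquad - \left[\mathbf{A}_\phi^{-1} - \mathbf{A}_\phi^{-1}\mathbf{u}\mathbf{u}^T\mathbf{A}_\phi^{-1}\right]_{12} C_{\Delta\phi}(\mathbf{x},\mathbf{y})\, C_{\Delta\phi}(\mathbf{y},\mathbf{x}) \Big] - 1 . \tag{45}$$

The result expresses the redshift-space correlation function roughly as the integral of the correlation functions against
the Gaussian distribution of the velocities at the two points.

Given the isotropy of the correlation functions $C_a(\mathbf{x},\mathbf{y})$ they must depend only on the lengths $x = |\mathbf{x}|$, $y = |\mathbf{y}|$ and
the angle between them, of which we take the cosine $\mu = \hat{\mathbf{n}}_z \cdot \hat{\mathbf{n}}_{z'}$, and so we write them as $C_a(\mathbf{x},\mathbf{y}) = \xi_a(x, y, \mu)$.
Similarly $\langle \Delta_s(\mathbf{z}) \Delta_s(\mathbf{z}') \rangle$ depends only on $z$, $z'$ and $\mu$, and we write it as $\xi_s(z, z', \mu)$. So in its final form the correlation
function is

$$\xi_s(z, z', \mu) = \frac{1}{2\pi} \int dx\, dy\, \det{}^{-1/2}\mathbf{A}_\phi\, \exp\left(-\frac{1}{2}\mathbf{u}^T \mathbf{A}_\phi^{-1} \mathbf{u}\right)$$
$$\qquad \times \Big[ 1 + \xi_\Delta(x,y,\mu) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_2\, \xi_{\Delta\phi}(x,y,\mu) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_1\, \xi_{\Delta\phi}(y,x,\mu) - \left[\mathbf{A}_\phi^{-1} - \mathbf{A}_\phi^{-1}\mathbf{u}\mathbf{u}^T\mathbf{A}_\phi^{-1}\right]_{12} \xi_{\Delta\phi}(x,y,\mu)\, \xi_{\Delta\phi}(y,x,\mu) \Big] - 1 . \tag{46}$$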



This closed form expression completely describes the non-linear redshift-space distortions and unlike the flat sky 
approach we have yet to make any assumptions about the change along the light cone. This ensures it is easy to 
incorporate the evolution of the fields and the background 0. A similar result, specific to the flat-sky was found in 

The correlation function is frequently used in the study of baryon acoustic oscillations (BAO) to describe the
distortions observed on small patches of sky. There it is conventionally denoted $\xi(\sigma, \pi)$, correlating points separated
by a comoving distance $\pi$ along the line of sight and $\sigma$ perpendicular to it, where the curvature of the sky is
neglected. This gives a total separation $r = \sqrt{\pi^2 + \sigma^2}$, and we place the points an average distance $\bar{z}$ from the origin.
We can calculate the non-linear equivalent in the flat sky by picking $z$, $z'$ and $\mu$ equivalent to $\sigma$, $\pi$ and $\bar{z}$:

$$z = \sqrt{1 + (\sigma/2\bar{z})^2}\; (\bar{z} + \pi/2) , \tag{47a}$$
$$z' = \sqrt{1 + (\sigma/2\bar{z})^2}\; (\bar{z} - \pi/2) , \tag{47b}$$
$$\cos^{-1}\mu = 2 \tan^{-1}\left(\frac{\sigma}{2\bar{z}}\right) . \tag{47c}$$
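As an illustration of the mapping (47a)-(47c), the sketch below converts a flat-sky pair $(\sigma, \pi)$ at mean distance $\bar{z}$ into the full-sky arguments $(z, z', \mu)$. The function name and argument names are our own illustrative choices, and we assume the left-hand side of (47c) is the opening angle $\cos^{-1}\mu$. The law of cosines applied to the output recovers $r^2 = \sigma^2 + \pi^2$, which is a convenient sanity check.

```python
import math

def flat_to_full_sky(sigma, pi_los, z_bar):
    """Map flat-sky separations to full-sky arguments following Eq. (47).

    sigma  : comoving separation perpendicular to the line of sight
    pi_los : comoving separation along the line of sight
    z_bar  : mean comoving distance of the pair from the observer
    Returns (z, z_prime, mu) with mu the cosine of the opening angle.
    """
    stretch = math.sqrt(1.0 + (sigma / (2.0 * z_bar)) ** 2)
    z = stretch * (z_bar + 0.5 * pi_los)
    z_prime = stretch * (z_bar - 0.5 * pi_los)
    theta = 2.0 * math.atan(sigma / (2.0 * z_bar))
    return z, z_prime, math.cos(theta)
```

For a purely radial pair ($\sigma = 0$) this returns $\mu = 1$ and $z - z' = \pi$, as expected.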



Figure 4 shows $\xi_s(\sigma, \pi)$ and the difference between the non-linear and linear results, $\Delta\xi_s(\sigma, \pi) = \xi_s(\sigma, \pi) - \xi_\alpha(\sigma, \pi)$,
for the exactly parallel and perpendicular cases, calculated by the above procedure. We discuss how to calculate the
flat-sky linear correlation function $\xi_\alpha(\sigma, \pi)$ in Appendix A. As in the previous cases the non-linear effects change
the correlations on small scales by significant amounts (around 10%), though the effect for the parallel case is much
smaller than for the perpendicular case. In the parallel case there is a smoothing of the acoustic peak, resulting in a small
suppression of around 3%.

The distortions introduced on the full sky are perhaps most conveniently described by the angular power
spectrum, giving the correlation of multipoles on different redshift slices. The $l$-th multipole moment $C_l(z, z')$ is found
by integrating with $P_l(\mu)$, the $l$-th Legendre polynomial, that is

$$C_l(z, z') = 2\pi \int_{-1}^{1} d\mu\, P_l(\mu)\, \xi_s(z, z', \mu) . \tag{48}$$
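The Legendre projection in Eq. (48) can be sketched in a few lines. This is an illustrative implementation only: the function names and the choice of a composite Simpson rule are assumptions of ours, and `xi` stands for any callable returning the correlation function at fixed $z$, $z'$ as a function of $\mu$.

```python
import math

def legendre(l, mu):
    """Legendre polynomial P_l(mu) from the Bonnet recurrence."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p

def angular_power(xi, l, n=2000):
    """C_l = 2 pi int_{-1}^{1} dmu P_l(mu) xi(mu), composite Simpson rule (n even)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        mu = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * legendre(l, mu) * xi(mu)
    return 2.0 * math.pi * total * h / 3.0
```

With the normalisation of Eq. (48), a correlation function built as $\xi(\mu) = \sum_l \frac{2l+1}{4\pi} C_l P_l(\mu)$ returns the input $C_l$, which is how the sketch can be validated. A realistic $\xi_s$ is sharply peaked near $\mu = 1$, so a production calculation would need a quadrature adapted to that peak.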
FIG. 4: The redshift-space correlation function $\xi_s(\sigma, \pi)$ at a redshift z = 10 for dark matter. The top panel illustrates the full
correlation function, $\xi_s(\sigma, \pi)$, and the non-linear contributions to it, $\Delta\xi_s(\sigma, \pi)$, in the parallel direction. The lower panel shows the
same, but in the perpendicular direction. The acoustic peak can clearly be seen at a comoving scale of around 100 $h^{-1}$ Mpc.
The sharp peaking in the non-linear contributions above 10 $h^{-1}$ Mpc is largely due to the smoothing effect on the acoustic
peak, and small perturbations around the zero crossing points that are large relative to the linear result.

Substituting (45) gives the final integral for the angular power spectrum (at $l > 0$) for redshifts $z$ and $z'$:

$$C_l(z, z') = 2\pi \int d\mu\, dx\, dy\, P_l(\mu)\, \frac{1}{2\pi \det^{1/2}\mathbf{A}_\phi} \exp\left(-\frac{1}{2}\mathbf{u}^T \mathbf{A}_\phi^{-1} \mathbf{u}\right)$$
$$\qquad \times \Big[ 1 + \xi_\Delta(x,y,\mu) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_2\, \xi_{\Delta\phi}(x,y,\mu) - [\mathbf{A}_\phi^{-1}\mathbf{u}]_1\, \xi_{\Delta\phi}(y,x,\mu) - \left[\mathbf{A}_\phi^{-1} - \mathbf{A}_\phi^{-1}\mathbf{u}\mathbf{u}^T\mathbf{A}_\phi^{-1}\right]_{12} \xi_{\Delta\phi}(x,y,\mu)\, \xi_{\Delta\phi}(y,x,\mu) \Big] . \tag{49}$$



FIG. 5: The equal redshift dark matter angular power spectrum for z = z' = 10. We plot the redshift space power spectrum
from the linear theory prediction, the non-linear result of Eq. (49) and the difference between the two. The correction is 1% at
$l \sim 520$ and becomes greater than 10% above $l \approx 11000$.

In getting to this result we have avoided most of the common assumptions made when considering redshift-space
problems, namely non-evolving field statistics and the distant observer approximation (at least for density fields like the
matter perturbation, and source number counts). This ensures it naturally incorporates any large angle geometric 
effects that are not included by taking the flat-sky power spectrum onto the full sky. For further discussion of this 
see Ref. d,[Ti|. 

The correlation functions $\xi_a(x, y, \mu)$ encapsulate all the information required to calculate the power spectrum, and
our formulation above remains completely general. To construct the correlations we must consider several effects,
notably the underlying matter correlations and growth along the light cone. In Appendix A we consider how to
calculate the correlation functions.

If choosing to use the distant observer approximation, or dealing approximately with radiative fields such as the
brightness temperature, we can follow through the same analysis above but starting from the contents of Section III.
This leads to the notationally simpler result



Ci(z,z') — 2n / d/idxdy 



u'a; 



27rdet 1/2 A,, 



12 



(50) 



Note that this is exact at lowest order, only dropping $O(\phi/x)$ curved-sky terms at higher order, provided we use the
correct forms of $\alpha$ for radiative and spatial fields.

In Figure 5 we plot the redshift-space dark matter power spectrum for slices of zero separation at a redshift of
z = 10, comparing the fully non-linear result to the linear theory (described in detail in [12]). The linear result is
essentially the generalisation of the Kaiser result onto the full sky, taking the form

$$C_l(z, z') = \frac{2}{\pi} \int dk\, k^2 \Big[ j_l(kz)\, j_l(kz')\, P_\Delta(k) - \left[ j_l(kz)\, j_l''(kz') + j_l''(kz)\, j_l(kz') \right] P_{\Delta v}(k) + j_l''(kz)\, j_l''(kz')\, P_v(k) \Big] . \tag{51}$$



At large $l$ we get a boost in power over the linear-theory results, as we would expect from the previous discussion
on the flat sky. The effect at small $l$ is less than 1%, though this is significantly more than the effect on the power
spectrum at equivalent wavenumbers: the lack of intrinsic power at large scales means that the large scale signal
in the angular power spectrum is primarily sourced from much higher wavenumbers where the non-linear effects are
greater. The increases on large scales are a consequence of this, with a possible contribution from including the distant
observer terms, though we have not disentangled their relative importance. Figure 5 does not obviously show the
acoustic peaks; this is a consequence of the fact that we do not include a window function in $z$, since the narrow band tends
to smooth out such features.



VI. COMPONENT SEPARATION FOR HIGH REDSHIFT 21CM OBSERVATION 

The observation of neutral hydrogen through the 21cm spin-flip transition provides a unique opportunity for probing 
the high-redshift Universe. In principle observations can give a three-dimensional view of structure in the Universe
from a redshift of z = 300 all the way down to the epoch of reionization at around z ~ 6 and below. The signal
seen in absorption at z > 30 is expected to be nearly linear, with significant redshift distortion [13], and containing
angular structure down to the baryon pressure-support scale [Til fig . [T^ |. With so many modes cosmology could 
be constrained to very high precision. Although nearly-linear, small non-linear effects will still be very important if 
observations are to be used reliably, so a non-linear treatment of redshift distortions will be essential. At redshifts
z < 30 the signal is expected to become much more complicated due to the presence of Lyman-α photons and
ionizing sources. Learning about cosmology from these observations would require detailed modelling of complicated
and poorly understood astrophysics (see Ref. [17] for a review). Likewise source number counts (in 21cm or otherwise)
are hard to model reliably due to scale and time-dependent bias. However in both cases the velocities are likely to be 
much closer to linear theory, making them a much more robust probe of the underlying cosmological perturbations. If 
redshift distortions can be isolated, they therefore represent a powerful way to learn about cosmological perturbations 
from present and near-future observations (e.g. see recent work in Refs. [3 [HI and references therein).

The quantity we are interested in for 21cm observations is the brightness temperature $T_b$, with perturbation $\Delta_{T_b}$.
In real space this is given approximately by

$$\Delta_{T_b} = \beta_b \delta_b + \beta_x \delta_x + \beta_\alpha \delta_\alpha + \beta_{T_k} \delta_{T_k} , \tag{52}$$

where $\delta_b$ is the baryon perturbation, $\delta_x$ the ionization fraction perturbation, $\delta_\alpha$ the Lyman-α coupling perturbation,
and $\delta_{T_k}$ the perturbation in the gas kinetic temperature. The $\beta_i$ depend on the background evolution; for a more
detailed overview see Ref. [20]. Note that throughout this section we return to the flat-sky approximation.

Although the astrophysics that affects the 21cm signal is very interesting in its own right, to constrain primordial
perturbations more directly we would like to determine the power spectrum of matter perturbations $P_\delta(k)$.
Unfortunately $\Delta_{T_b}$ mixes the astrophysical information from the ionization fraction, Lyman-α coupling and gas temperature
in with the cosmological information we desire. However redshift-space distortions add in further information directly
linked to the matter perturbations in the approximation in which the source velocities follow the linear CDM velocity.
The linear redshift-space power spectrum can then be written

$$P_{s,\mathrm{lin}}(\mathbf{k}) = P_{T_b}(k) + 2\mu_k^2\, P_{T_b,v}(k) + \mu_k^4\, P_v(k) , \tag{53}$$

where $P_{T_b}(k)$ is the power spectrum of brightness temperature fluctuations in real space, encapsulating all the
correlations and cross-correlations of Eq. (52). The term $P_{T_b,v}(k)$ gives the cross-correlation with the velocity
perturbation $\delta_v$. At linear order we see that the $\mu_k^4$ contribution is entirely the matter power spectrum, giving a possible
method of separation without needing to understand the detailed physics encapsulated in $P_{T_b}(k)$ and $P_{T_b,v}(k)$ [21, 22].
However this approach is reliant on the use of the linear expansion: as we can see in Eq. (35) the full angular behaviour
is much more complicated, and does not lend itself to an easy separation in powers of $\mu_k$. So we should expect this
naive separation method to perform badly wherever the non-linear contributions are important.

To test this in an ideal case, we calculate the theoretical dark matter power spectrum in redshift space at a redshift
z = 10. Taking 100 points equally spaced in $\mu_k$ we integrate with $P_4(\mu_k)$, the fourth Legendre polynomial, to isolate
the $\mu_k^4$ contribution. With the appropriate normalization, our estimator, exact within linear theory, is

$$\hat{P}_v(k) = \frac{315}{16} \int_{-1}^{1} d\mu_k\, P_s(k, \mu_k)\, P_4(\mu_k) . \tag{54}$$
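The behaviour of the estimator in Eq. (54) can be illustrated with a toy spectrum. The sketch below is not the calculation used for the figures: the function names, the Simpson-rule quadrature and the polynomial test spectrum are illustrative assumptions, and we take the normalisation $315/16$ with the integral over $[-1, 1]$. Because $P_4$ is orthogonal to $1$ and $\mu^2$, and $\int_{-1}^{1} \mu^4 P_4\, d\mu = 16/315$, the estimator returns exactly the $\mu_k^4$ coefficient of a Kaiser-like input.

```python
def p4(mu):
    """Fourth Legendre polynomial."""
    return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0

def estimate_pv(ps_of_mu, n=1000):
    """Estimator of Eq. (54): (315/16) int_{-1}^{1} dmu P_s(mu) P_4(mu).

    ps_of_mu : callable giving the redshift-space power at fixed k versus mu.
    Uses a composite Simpson rule with n (even) intervals.
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        mu = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * ps_of_mu(mu) * p4(mu)
    return 315.0 / 16.0 * total * h / 3.0
```

Any higher power of $\mu_k$ leaks into the estimate: a pure $\mu_k^6$ term of unit amplitude, for example, contributes $15/11$. This is the mechanism by which the non-linear angular structure biases the recovered spectrum.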

FIG. 6: The input real-space matter power spectrum at z = 10 compared to that recovered via the estimator $\hat{P}_v$ given by
Eq. (54). We include the errors (shading) for a Hubble volume sized survey at z = 10 assuming a binning of $\Delta k/k = 0.1$. The
estimator error corresponds to the error if we used the estimator $\hat{P}_v$ discussed in the text. We also plot the intrinsic error that
would be seen if we could measure the modes $\delta_v$ directly. At high $k$ the recovered spectrum differs dramatically from that input
due to the importance of higher-order terms, giving a significant systematic bias outside of the statistical errors.

We compare the underlying power spectrum with that recovered via this method in Fig. 6. The recovered power
spectrum is artificially high at large $k$. Repeating this with a power spectrum generated from the linear result
reproduces the input exactly, as expected. In Appendix C we calculate the leading-order non-linear correction on
small scales, which shows that we have a direct $\mu_k^4$ contribution taking the form $\mu_k^4 P_v(k)\, \xi_\alpha(0)$. This combines the
power spectrum we desire with the source point variance of large scales, mixing in information from the large-scale
astrophysics, and is a significant contributor to the bias of this estimator. Correct interpretation of high-redshift 
observations on small scales will therefore require a more sophisticated analysis that accounts for the complicated 
angular behaviour introduced at non-linear order, or modelling of the astrophysics in a realistic and accurate manner. 

To assess whether any bias is significant, we can calculate the variance of this estimator given a few assumptions
about the density of the sampling we can perform in k-space. We assume a survey of a large volume of the universe
$V$ centered at a redshift $z$, that has a small angular span such that we are still in the flat sky. We define an estimator
for the power spectrum at a wavenumber $k$ and line of sight angle $\cos^{-1}\mu$ that, using a suitable weighting function
$w_\mathbf{k}(k, \mu)$, is defined by

$$\hat{P}_s(k, \mu) = \sum_\mathbf{k} w_\mathbf{k}(k, \mu)\, |\Delta_\mathbf{k}|^2 , \tag{55}$$

where the summation is over all the samples in Fourier space. We are free to choose any weighting function such that
the ensemble average $\langle \hat{P}_s(k, \mu) \rangle = P_s(k, \mu)$. Calculating the $\mu$-covariance of this estimator we find

$$\langle \Delta\hat{P}_s(k, \mu_1)\, \Delta\hat{P}_s(k, \mu_2) \rangle = 2 \sum_\mathbf{k} w_\mathbf{k}(k, \mu_1)\, w_\mathbf{k}(k, \mu_2)\, P_s(\mathbf{k})^2 . \tag{56}$$

From (54) the variance of the estimator $\hat{P}_v$ is given by

$$\langle \Delta\hat{P}_v(k)^2 \rangle = \left(\frac{315}{16}\right)^2 \int d\mu_1\, d\mu_2\, P_4(\mu_1)\, P_4(\mu_2)\, \langle \Delta\hat{P}_s(k, \mu_1)\, \Delta\hat{P}_s(k, \mu_2) \rangle . \tag{57}$$

Ideally we would optimise the weights $w_\mathbf{k}(k, \mu)$ to minimize the variance of $\hat{P}_v$, but for our purposes it will suffice to
pick a representative form: averaging in bins of width $\Delta k$ and $\Delta\mu$. This picks out $k^2 \Delta k\, \Delta\mu\, V/(2\pi)^2 = n(k, \mu)\, \Delta k\, \Delta\mu$
modes, and we assume that our samples in $\mu$ are spaced widely enough that the summation of (56) contributes only
when $\mu_1$ equals $\mu_2$, giving

$$w_\mathbf{k}(k, \mu) = \begin{cases} 1 / \left[ n(k, \mu)\, \Delta k\, \Delta\mu \right] & |\mathbf{k}| \in [k, k + \Delta k] ,\; \hat{\mathbf{n}} \cdot \hat{\mathbf{k}} \in [\mu, \mu + \Delta\mu] \\ 0 & \text{otherwise} \end{cases} \tag{58}$$

Given the finite samples in $\mu$ we can draw, we approximate the integrals of (57) as summations

$$\langle \Delta\hat{P}_v(k)^2 \rangle \approx \frac{99225}{128} \sum_{i,j} (\Delta\mu)^2\, P_4(\mu_i)\, P_4(\mu_j) \sum_\mathbf{k} w_\mathbf{k}(k, \mu_i)\, w_\mathbf{k}(k, \mu_j)\, P_s(\mathbf{k})^2 . \tag{59}$$

Substituting for $w_\mathbf{k}(k, \mu)$ connects the summations over $i$ and $j$, and writing the density of modes with a wavevector
of length $k$ as $n(k) = 4\pi k^2 V/(2\pi)^3$ we have

$$\langle \Delta\hat{P}_v(k)^2 \rangle \approx \frac{99225}{64}\, \frac{1}{n(k)\, \Delta k} \sum_i \Delta\mu\, P_4(\mu_i)^2\, P_s(k, \mu_i)^2 . \tag{60}$$

At low $k$ the Kaiser result is a reasonable approximation, and thus we use this to calculate the variance. Taking the
continuum limit of the summation, we can perform the angular integral analytically for fields with linear bias. The
lower bound for the error is the unbiased tracer $b = 1$, giving the numerical result

$$\frac{\Delta\hat{P}_v(k)}{P_v(k)} \approx \frac{50}{\sqrt{n(k)\, \Delta k}} . \tag{61}$$

This shows that the errors in calculating the underlying velocity power spectrum by this component separation are
around 35 times larger than those we would find if we could directly measure velocity modes within the observed
volume. This increases the lowest $k$ we could infer by around a factor of 10. The plot in Fig. 6 illustrates the dark
matter tracing case for which $b = 1$ and the errors are exact. For 21cm we expect to find a large bias and thus the
errors are dominated by the contribution of the variance of the $P_\Delta(k)$ term. Asymptotically, for large bias

$$\frac{\Delta\hat{P}_v(k)}{P_v(k)} \approx \frac{19\, b^2}{\sqrt{n(k)\, \Delta k}} . \tag{62}$$
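The numerical coefficients quoted above can be reproduced with a short calculation. The sketch below is our own check rather than the original computation, and rests on stated assumptions: the continuum limit of Eq. (60) with the integral taken over $\mu \in [-1, 1]$, a Kaiser spectrum $P_s/P_v = (b + \mu^2)^2$ with growth rate $f = 1$ (appropriate at $z = 10$), and an intrinsic error $\sqrt{2 / (n(k)\Delta k)}$ for directly measured modes. The function names are illustrative.

```python
import math

def p4(mu):
    """Fourth Legendre polynomial."""
    return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0

def error_coefficient(ps_over_pv, n=2000):
    """Coefficient c in  Delta P_v / P_v = c / sqrt(n(k) Delta k).

    Continuum limit of Eq. (60):
        c^2 = (315/8)^2 * int_{-1}^{1} dmu P_4(mu)^2 (P_s/P_v)^2
    evaluated with a composite Simpson rule (n even).
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        mu = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (p4(mu) * ps_over_pv(mu)) ** 2
    return 315.0 / 8.0 * math.sqrt(total * h / 3.0)

def kaiser_ratio(b):
    """P_s / P_v for a linearly biased tracer with f = 1 (assumption)."""
    return lambda mu: (b + mu ** 2) ** 2
```

Under these assumptions `error_coefficient(kaiser_ratio(1.0))` evaluates to about 48.9, consistent with the rounded value of 50 in Eq. (61); dividing by $\sqrt{2}$ gives about 34.6, the factor of roughly 35 relative to the intrinsic error. For a constant ratio $P_s/P_v = b^2$ the coefficient is about $18.6\, b^2$, matching the factor of 19 in Eq. (62).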

To overcome this, Ref. [23] suggests that combining multiple tracers with distinct biases may be able to reduce this error
down closer to the intrinsic level. Though obviously useful for lower redshift surveys where many independent tracers
can be found as different galaxy populations, they suggest it may be possible to use this for 21cm observations by
applying certain non-linear transformations to the observed field. This method, however, is dependent upon the linear
result being correct, restricting its applicability to large scales.

One further ramification is that the higher order angular effects from the non-linear distortions blur any distinction
between the Alcock-Paczynski (AP) effect and redshift distortions. This may produce complications for
methods that seek to obtain cosmological constraints through the AP effect [24]. Generally these provide constraints
by tuning parameters until angular dependence of $\mu^6$ and above is eliminated (which is zero for linear redshift-space
distortions). However at large $k$ the non-linearities in redshift space ensure that even in the correct cosmology,
contributions from higher powers of $\mu$ will be non-zero, and tuning them to zero would introduce errors in the
parameter fitting. The significance of this is unknown; it may or may not be that the nearly linear low-$k$ modes are
sufficient to produce constraints unfettered by this.



VII. CONCLUSIONS 



We have shown how to calculate the non-linear effects of redshift distortion on the power spectrum in the approximation
of Gaussian fields. On small scales the non-linear contributions are important for modes with a component along
the line of sight, even at high redshift. Superposition of small-scale power on larger-scale linear modes gives a boost
in power on small scales comparable to that from non-linear structure growth. On larger scales smearing by small
scale velocities leads to a suppression of power. Any future attempt to extract precision cosmology from high-redshift
observations will need to account carefully for these effects. In order to suitably describe the behaviour on the full
sky we also extended our technique to allow calculation of the angular correlation function and power spectrum.
These both have the advantage of naturally incorporating evolution effects of the background and the fields involved,
provided they remain Gaussian.

For a fully consistent analysis the non-linear growth and non-Gaussianity should also be accounted for though at 
present our work does not yet allow this. Despite this we have already demonstrated that just the non-linearities 
introduced by the mapping from redshift to real space significantly complicate any plan to make accurate measurements 
of cosmological perturbations by looking for the angular structure in the redshift-space signal from our light cone. 

In addition to having a significant effect on the power spectrum and correlation function as discussed in this paper, 
redshift distortions will also introduce non-Gaussianity. For example there is a non-zero bispectrum for modes that 
are not all orthogonal to the line of sight. In the approximation of underlying Gaussian fields the method developed 
in this paper extends straightforwardly to higher n-point functions. This signal will have to be accounted for at high 
accuracy (along with the bispectrum introduced by non-linear growth) when attempting to use future high-redshift 
observations to constrain primordial non-Gaussianity [H, [26| . 

Our work could also be extended to include lensing, which in the Gaussian approximation is just another correlated 
random field that perturbs points orthogonal to the line of sight. 
APPENDIX A: EVALUATING THE CORRELATION FUNCTIONS

To calculate the redshift-space power spectrum we must be able to compute the correlation functions $\xi_\Delta$, $\xi_{\Delta\phi}$ and
$\xi_\phi$ in terms of the matter power spectrum. To start, we note that the 3D Fourier transform of a radially symmetric
function can be simplified dramatically to a 1D transform

$$\int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}} f(k) = \frac{1}{2\pi^2} \int dk\, j_0(kr) \left[ k^2 f(k) \right] , \tag{A1}$$
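Equation (A1) can be checked against a case with a known answer. The sketch below is illustrative only: the function names, the cut-off `k_max` and the Simpson-rule quadrature are our own assumptions. It evaluates the 1D transform for a Gaussian $f(k) = e^{-k^2/2}$, whose 3D inverse Fourier transform is $(2\pi)^{-3/2} e^{-r^2/2}$.

```python
import math

def j0(x):
    """Zeroth spherical Bessel function sin(x)/x, with the x -> 0 limit."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x

def radial_transform(f, r, k_max=12.0, n=4000):
    """Right-hand side of Eq. (A1): (1 / 2 pi^2) int_0^k_max dk j_0(kr) k^2 f(k).

    f must decay fast enough that truncating at k_max is harmless.
    Composite Simpson rule with n (even) intervals.
    """
    h = k_max / n
    total = 0.0
    for i in range(n + 1):
        k = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * j0(k * r) * k * k * f(k)
    return total * h / 3.0 / (2.0 * math.pi ** 2)
```

A realistic power spectrum decays only as a power law, so the oscillatory tail of $j_0(kr)$ needs more careful treatment than this fixed cut-off.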

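where $j_0(x) = \sin x / x$ is the zeroth spherical Bessel function. We can generalize this to encapsulate the integrals we
will require later on. Expanding in terms of spherical harmonics we use the identities for $e^{i\mathbf{k}\cdot\mathbf{r}}$ and $(\hat{\mathbf{n}}\cdot\hat{\mathbf{k}})^n$

$$e^{i\mathbf{k}\cdot\mathbf{r}} = 4\pi \sum_{lm} i^l\, j_l(kr)\, Y^*_{lm}(\hat{\mathbf{k}})\, Y_{lm}(\hat{\mathbf{r}}) , \tag{A2a}$$
$$(\hat{\mathbf{n}}\cdot\hat{\mathbf{k}})^n = 4\pi \sum_{lm} \frac{n!}{(n-l)!!\, (n+l+1)!!}\, Y^*_{lm}(\hat{\mathbf{n}})\, Y_{lm}(\hat{\mathbf{k}}) , \tag{A2b}$$

where $\hat{\mathbf{n}}$ is a direction of our choosing and the sum in (A2b) runs over $l \le n$ with $n - l$ even. With these we can easily evaluate integrals of the form

$$\int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, (\hat{\mathbf{n}}\cdot\hat{\mathbf{k}})^n f(k) = \frac{1}{2\pi^2} \sum_l i^l\, \frac{(2l+1)\, n!}{(n-l)!!\, (n+l+1)!!}\, P_l(\hat{\mathbf{n}}\cdot\hat{\mathbf{r}}) \int dk \left[ k^2 f(k) \right] j_l(kr) , \tag{A3}$$

where we have used the orthogonality and addition relations of the spherical harmonics. For any $n$ the summation
only has non-zero elements as far as $l = n$; this means that for the small $n$ we are considering the summations will be
limited to only a few terms.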
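The coefficients appearing in (A2b) and (A3) are those of the Legendre expansion $\mu^n = \sum_l c_{nl} P_l(\mu)$ with $c_{nl} = (2l+1)\, n! / [(n-l)!!\, (n+l+1)!!]$ for $n - l$ even and non-negative. As a minimal sketch (function names are our own), they can be tabulated and verified directly:

```python
import math

def double_factorial(n):
    """n!! with the convention 0!! = (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def legendre(l, mu):
    """Legendre polynomial P_l(mu) from the Bonnet recurrence."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for m in range(1, l):
        p_prev, p = p, ((2 * m + 1) * mu * p - m * p_prev) / (m + 1)
    return p

def power_coefficient(n, l):
    """Coefficient of P_l(mu) in the Legendre expansion of mu^n."""
    if l > n or (n - l) % 2:
        return 0.0
    return (2 * l + 1) * math.factorial(n) / (
        double_factorial(n - l) * double_factorial(n + l + 1))
```

For $n = 4$ this gives $1/5$, $4/7$ and $8/35$ for $l = 0, 2, 4$, the familiar coefficients of the redshift-space multipoles.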

Our first assumption is that $\Delta$ is a statistically isotropic and homogeneous scalar (for example the density
perturbation). Secondly we stay with the definition of $\delta_v$ from Eq. (10). As a reminder, in real space this relates $\phi$ and $\delta_v$
via

$$\nabla \cdot \mathbf{v}(\mathbf{x}) = -\mathcal{H}\, \delta_v(\mathbf{x}) , \tag{A4}$$

where $\nabla^{-2}$ is the inverse Laplacian operator. For observations tracing the underlying matter distribution, $\delta_v = f \delta_m$
exactly in the pressureless limit. We will use the Fourier space equivalent

$$\mathbf{v}(\mathbf{k}) = i \mathcal{H}\, \frac{\mathbf{k}}{k^2}\, \delta_v(\mathbf{k}) . \tag{A5}$$

We will eventually express the correlations in terms of transforms of the power spectra defined by

$$\langle \Delta(\mathbf{k}; z_x)\, \Delta(\mathbf{q}; z_y) \rangle = (2\pi)^3 \delta^3(\mathbf{k} + \mathbf{q})\, P_\Delta(k; z_x, z_y) , \tag{A6a}$$
$$\langle \Delta(\mathbf{k}; z_x)\, \delta_v(\mathbf{q}; z_y) \rangle = (2\pi)^3 \delta^3(\mathbf{k} + \mathbf{q})\, P_{\Delta v}(k; z_x, z_y) , \tag{A6b}$$
$$\langle \delta_v(\mathbf{k}; z_x)\, \delta_v(\mathbf{q}; z_y) \rangle = (2\pi)^3 \delta^3(\mathbf{k} + \mathbf{q})\, P_v(k; z_x, z_y) , \tag{A6c}$$

which correlate Fourier modes at different epochs given by the redshifts $z_x$ and $z_y$. In linear theory we can write these
in terms of the transfer functions $T$ and the primordial power spectrum $P_\chi$

$$P_\Delta(k; z_x, z_y) = T_\Delta(z_x, k)\, T_\Delta(z_y, k)\, P_\chi(k) , \tag{A7a}$$
$$P_{\Delta v}(k; z_x, z_y) = T_\Delta(z_x, k)\, T_v(z_y, k)\, P_\chi(k) , \tag{A7b}$$
$$P_v(k; z_x, z_y) = T_v(z_x, k)\, T_v(z_y, k)\, P_\chi(k) . \tag{A7c}$$

Numerical calculation of the power spectra can be done via codes such as CAMB [27], or for 21cm perturbations
CAMB Sources [H].

The correlation functions can be written in terms of the correlations of A and S v . Denoting 0(x) = V _2 <5 tI (x) for 
brevity, they are 

C A (x,y) = (A(x)A(y)) , (A8a) 

C A ^(x,y)=y i (A(x) Ui (y)) , (A8b) 

C^(x,y) = anyj (ui(x)uj(y)) . (A8c) 

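This reduces the problem down to calculating $\langle \Delta(\mathbf{x})\, v_i(\mathbf{y}) \rangle$ and $\langle v_i(\mathbf{x})\, v_j(\mathbf{y}) \rangle$. Given the statistical homogeneity and
isotropy, these can be decomposed into an isotropic function of the separation $r = |\mathbf{x} - \mathbf{y}|$ combined with the admissible
angular factors constructed from $\hat{\mathbf{r}}$:

$$\langle \Delta(\mathbf{x})\, \Delta(\mathbf{y}) \rangle = A(r) , \tag{A9a}$$
$$\langle \Delta(\mathbf{x})\, v_i(\mathbf{y}) \rangle = \mathcal{H}\, B(r)\, \hat{r}_i , \tag{A9b}$$
$$\langle v_i(\mathbf{x})\, v_j(\mathbf{y}) \rangle = \mathcal{H}^2 \left[ C(r)\, \delta_{ij} + D(r) \left( \hat{r}_i \hat{r}_j - \tfrac{1}{3}\delta_{ij} \right) \right] , \tag{A9c}$$

where we add the factors of $\mathcal{H}$ for later convenience. $\langle \Delta(\mathbf{x})\, \Delta(\mathbf{y}) \rangle$ is a scalar function and is simply the transform of the
power spectrum $P_\Delta$

$$A(r) = \frac{1}{2\pi^2} \int dk\, j_0(kr)\, k^2\, P_\Delta(k; z_x, z_y) , \tag{A10}$$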

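where we leave the $z_x, z_y$ dependence implicit. The other correlation functions are more complicated. There is only
one possible direction the vector correlation function $\langle \Delta(\mathbf{x})\, \mathbf{v}(\mathbf{y}) \rangle$ can lie along, the separation vector $\mathbf{r}$. Multiplying
by another $\hat{r}_i$ and contracting, we explicitly find $B(r)$ by substituting the Fourier transform and relating
this to the cross power spectrum of $\Delta$ and $\delta_v$

$$B(r) = \langle \Delta(\mathbf{x})\, \hat{\mathbf{r}} \cdot \mathbf{v}(\mathbf{y}) \rangle / \mathcal{H} = \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \frac{i\, \mathbf{k}\cdot\hat{\mathbf{r}}}{k^2}\, P_{\Delta v}(k; z_x, z_y) = -\frac{1}{2\pi^2} \int_0^\infty dk\, j_1(kr)\, k\, P_{\Delta v}(k; z_x, z_y) . \tag{A11}$$

The correlation $\langle v_i(\mathbf{x})\, v_j(\mathbf{y}) \rangle$ forms a rank-2 tensor that we separate into an isotropic part $C(r)$ and the traceless
outer product of $\hat{r}_i$ and $\hat{r}_j$ given by $D(r)$. Taking the trace isolates $C(r)$, and along the same lines as above we find

$$C(r) = \frac{1}{3} \langle \mathbf{v}(\mathbf{x}) \cdot \mathbf{v}(\mathbf{y}) \rangle / \mathcal{H}^2 = \frac{1}{3} \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \frac{1}{k^2}\, P_v(k; z_x, z_y) = \frac{1}{3}\, \frac{1}{2\pi^2} \int_0^\infty dk\, j_0(kr)\, P_v(k; z_x, z_y) . \tag{A12}$$

Finally we calculate the traceless part $D(r)$

$$D(r) = \frac{3}{2} \langle v_i(\mathbf{x})\, v_j(\mathbf{y}) \rangle \left( \hat{r}_i \hat{r}_j - \tfrac{1}{3}\delta_{ij} \right) / \mathcal{H}^2 = \frac{3}{2} \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \frac{1}{k^2}\, P_v(k; z_x, z_y) \left[ (\hat{\mathbf{k}}\cdot\hat{\mathbf{r}})^2 - \tfrac{1}{3} \right] = -\frac{1}{2\pi^2} \int_0^\infty dk\, j_2(kr)\, P_v(k; z_x, z_y) . \tag{A13}$$

With these functions calculated we can now express the correlation functions in terms of them

$$C_\Delta(\mathbf{x}, \mathbf{y}) = A(r) , \tag{A14a}$$
$$C_{\Delta\phi}(\mathbf{x}, \mathbf{y}) = \mu_y\, B(r) , \tag{A14b}$$
$$C_\phi(\mathbf{x}, \mathbf{y}) = C(r)\, \mu_{xy} + D(r) \left( \mu_x \mu_y - \tfrac{1}{3}\mu_{xy} \right) , \tag{A14c}$$

where $\mu_x = \hat{\mathbf{x}}\cdot\hat{\mathbf{r}}$, $\mu_y = \hat{\mathbf{y}}\cdot\hat{\mathbf{r}}$ and $\mu_{xy} = \hat{\mathbf{x}}\cdot\hat{\mathbf{y}}$.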


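These results are general; to neaten up the notation somewhat we specialize them to the flat and curved sky cases
we have considered. For the flat sky $\hat{\mathbf{x}} = \hat{\mathbf{y}} = \hat{\mathbf{n}}$, and so $\mu_x = \mu_y = \mu_r$ and $\mu_{xy} = 1$. Evolution along the light cone is
also neglected, so we evaluate the power spectra at a single fixed redshift $z$, giving

$$\xi_\Delta(r) = A(r) , \tag{A15a}$$
$$\xi_{\Delta\phi}(r, \mu_r) = \mu_r\, B(r) , \tag{A15b}$$
$$\xi_\phi(r, \mu_r) = \left[ C(r) - \tfrac{1}{3} D(r) \right] + \mu_r^2\, D(r) . \tag{A15c}$$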



For the curved sky, the correlation function is dependent only on the radial distances of the points, and the angular
separation about the origin $\mu = \mu_{xy}$. In terms of these $r = (x^2 + y^2 - 2xy\mu)^{1/2}$, $\mu_x = (y\mu - x)/r$ and $\mu_y = (y - x\mu)/r$,
leaving

$$\xi_\Delta(x, y, \mu) = A(r) , \tag{A16a}$$
$$\xi_{\Delta\phi}(x, y, \mu) = \mu_y\, B(r) , \tag{A16b}$$
$$\xi_\phi(x, y, \mu) = \mu \left[ C(r) - \tfrac{1}{3} D(r) \right] + \mu_x \mu_y\, D(r) . \tag{A16c}$$
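The curved-sky geometry entering Eqs. (A16) is simple to code up. The sketch below is illustrative: the function names are ours, the separation vector is taken as $\mathbf{r} = \mathbf{y} - \mathbf{x}$ (the convention implied by the expressions for $\mu_x$ and $\mu_y$ above), and `C` and `D` are placeholder callables standing in for the radial functions of Eqs. (A12) and (A13).

```python
import math

def separation_geometry(x, y, mu):
    """Return (r, mu_x, mu_y) for points at radial distances x, y whose
    directions from the observer are separated by an angle arccos(mu)."""
    r = math.sqrt(x * x + y * y - 2.0 * x * y * mu)
    mu_x = (y * mu - x) / r   # cosine between the direction to x and r
    mu_y = (y - x * mu) / r   # cosine between the direction to y and r
    return r, mu_x, mu_y

def xi_phi_curved(x, y, mu, C, D):
    """Eq. (A16c): mu [C(r) - D(r)/3] + mu_x mu_y D(r)."""
    r, mu_x, mu_y = separation_geometry(x, y, mu)
    return mu * (C(r) - D(r) / 3.0) + mu_x * mu_y * D(r)
```

For distant points with a small opening angle, $\mu \to 1$ and $\mu_x \approx \mu_y$, so (A16c) reduces to the flat-sky form (A15c).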



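In order to calculate the flat-sky linear redshift correlation function $\xi_\alpha(r, \mu_r)$, we transform the linear redshift-space
power spectrum $P_\alpha(\mathbf{k})$, where as we defined earlier $\alpha = \Delta - \phi'$ is the linear perturbation in redshift space. Transforming
Eq. (11) term by term, again using Eq. (A3), we end up with the following

$$\xi_\alpha(r, \mu_r) = \xi^0_\Delta(r) + \tfrac{2}{3}\, \xi^0_{\Delta v}(r) + \tfrac{1}{5}\, \xi^0_v(r) - \left[ \tfrac{4}{3}\, \xi^2_{\Delta v}(r) + \tfrac{4}{7}\, \xi^2_v(r) \right] P_2(\mu_r) + \tfrac{8}{35}\, \xi^4_v(r)\, P_4(\mu_r) , \tag{A17}$$

where we have defined the correlation-like functions $\xi^n_a(r)$ by

$$\xi^n_a(r) = \frac{1}{2\pi^2} \int dk\, k^2\, P_a(k)\, j_n(kr) . \tag{A18}$$

To use the standard form $\xi_\alpha(\sigma, \pi)$, we simply set $r = \sqrt{\sigma^2 + \pi^2}$ and $\mu_r = \pi/r$.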



APPENDIX B: PERTURBATIVE SERIES EXPANSION



In this Appendix we discuss the perturbative expansion of the redshift-space relation

$$\Delta_s(s) = \frac{\Delta(x) - \phi'(x)}{1 + \phi'(x)} , \tag{B1}$$

where $x = s - \phi(x)$. This equation is exact for radiative fields but uses the distant-observer approximation for number
counts. To solve this implicit equation for $\Delta_s$ we turn to the Lagrange Reversion Theorem [1], which will give us the
result in terms of a series expansion. The theorem states that if we have an implicit definition $v = x + y f(v)$ then
the function $g(v)$ is given by the series

$$g(v) = g(x) + \sum_{k=1}^{\infty} \frac{y^k}{k!}\, \frac{d^{k-1}}{dx^{k-1}} \left( f(x)^k\, g'(x) \right) . \tag{B2}$$

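To obtain $\Delta_s(s)$ we make the obvious assignments to obtain

$$\Delta_s(s) = \left[ \frac{\Delta - \phi'}{1 + \phi'} + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\, \frac{d^{k-1}}{dx^{k-1}} \left( \phi^k\, \frac{d}{dx} \left( \frac{\Delta - \phi'}{1 + \phi'} \right) \right) \right]_{x=s} . \tag{B3}$$

Expanding $(1 + \phi')^{-1} = \sum_{m=0}^{\infty} (-\phi')^m$ and grouping terms of order $n + 1$ this simplifies to

$$\Delta_s(s) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{d^n}{dx^n} \left[ (\Delta - \phi')\, \phi^n \right] . \tag{B4}$$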

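Perturbative results can be obtained using this series expansion, though the perturbative result for the power spectrum
is actually obtained more straightforwardly by expansion of the non-perturbative result, as we show in Appendix C.
The series result can also be written with un-grouped terms as

$$\Delta_s(s) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{d^n}{dx^n} \left[ (1 + \Delta)\, \phi^n \right] - 1 . \tag{B5}$$

Fourier-transforming Eq. (B4) we have

$$\Delta_s(\mathbf{k}) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, \frac{\partial^n}{\partial x^n} \left[ (\Delta - \phi')\, \phi^n \right] \tag{B6}$$
$$\qquad = \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}} \left[ \Delta(\mathbf{x}) - \phi'(\mathbf{x}) \right] e^{-i k_\parallel \phi(\mathbf{x})} , \tag{B7}$$

which recovers Eq. (28) of the main text. In the second line we dropped curved-sky corrections from the radial
derivatives of $x^2$ that arise when integrating by parts, which is consistent at linear but not at higher order.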



APPENDIX C: PERTURBATIVE RESULT FOR THE REDSHIFT SPACE POWER SPECTRUM 

1. General Expansion 

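Given that we have a general method for calculating the full non-linear result, a perturbative result is perhaps a little
crude; however it provides some insight into the source of the most important non-linear effects. We develop the
perturbation series from our result for the flat-sky spectrum in terms of the first order source $\alpha$,

$$P_s(\mathbf{k}) = \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, e^{-k_\parallel^2 \left[ \xi_\phi(0) - \xi_\phi(\mathbf{r}) \right]} \left[ \xi_\alpha(\mathbf{r}) - k_\parallel^2\, \xi_{\alpha\phi}(\mathbf{r})^2 \right] . \tag{C1}$$

First we expand the exponential

$$P_s(\mathbf{k}) = \sum_{n=0}^{\infty} \frac{k_\parallel^{2n}}{n!} \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}} \left( \xi_\phi(\mathbf{r}) - \xi_\phi(0) \right)^n \left[ \xi_\alpha(\mathbf{r}) - k_\parallel^2\, \xi_{\alpha\phi}(\mathbf{r})^2 \right] , \tag{C2}$$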

1 See e.g. http://en.wikipedia.org/wiki/Lagrange_reversion_theorem 
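and then we re-sum the term in $\xi_\alpha$ such that each term in the overall summation contains contributions from the
same order in the correlation functions

$$P_s(\mathbf{k}) = P_\alpha(\mathbf{k}) + \sum_{n=0}^{\infty} \frac{k_\parallel^{2(n+1)}}{(n+1)!} \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}} \left[ \xi_\alpha(\mathbf{r})\, \xi_\phi(\mathbf{r}) - \xi_\alpha(\mathbf{r})\, \xi_\phi(0) - (n+1)\, \xi_{\alpha\phi}(\mathbf{r})^2 \right] \left( \xi_\phi(\mathbf{r}) - \xi_\phi(0) \right)^n . \tag{C3}$$

The power spectrum $P_\alpha$ can be written in terms of the power spectra of $\Delta$ and $\delta_v$, and similarly for the power spectra
$P_{\alpha\phi}$ and $P_\phi$:

$$P_\alpha(\mathbf{k}) = P_\Delta(k) + 2\mu_k^2\, P_{\Delta v}(k) + \mu_k^4\, P_v(k) , \tag{C4a}$$
$$P_{\alpha\phi}(\mathbf{k}) = -i\, \frac{\mu_k}{k} \left[ P_{\Delta v}(k) + \mu_k^2\, P_v(k) \right] , \tag{C4b}$$
$$P_\phi(\mathbf{k}) = \frac{\mu_k^2}{k^2}\, P_v(k) . \tag{C4c}$$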


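Using the convolution theorem we turn the Fourier transform of the products of correlations into a convolution of the
corresponding power spectra, giving

$$P_s(\mathbf{k}) = P_\alpha(\mathbf{k}) + \sum_{n=0}^{\infty} \frac{k_\parallel^{2(n+1)}}{(n+1)!} \int \frac{d^3k_0}{(2\pi)^3} \frac{d^3k_1}{(2\pi)^3} \frac{d^3k_2}{(2\pi)^3}\, (2\pi)^3 \delta^3(\mathbf{k}_0 + \mathbf{k}_1 + \mathbf{k}_2 - \mathbf{k})\, \mathcal{P}_n(\mathbf{k}_2)$$
$$\qquad \times \left[ P_\alpha(\mathbf{k}_0)\, P_\phi(\mathbf{k}_1) - P_\alpha(\mathbf{k}_0)\, \xi_\phi(0)\, (2\pi)^3 \delta^3(\mathbf{k}_1) - (n+1)\, P_{\alpha\phi}(\mathbf{k}_0)\, P_{\alpha\phi}(\mathbf{k}_1) \right] , \tag{C5}$$

where $\xi_\phi(0)$ is the mean squared line of sight velocity at a point, $\xi_\phi(0) = \frac{1}{3\mathcal{H}^2} \langle v^2 \rangle$. The convolution kernel $\mathcal{P}_n(\mathbf{k})$ is
defined as an n-fold convolution of $P_\phi(\mathbf{k}) - (2\pi)^3 \xi_\phi(0)\, \delta^3(\mathbf{k})$,

$$\mathcal{P}_n(\mathbf{k}) = (2\pi)^3 \int \frac{d^3q_1}{(2\pi)^3} \cdots \frac{d^3q_n}{(2\pi)^3} \left[ P_\phi(\mathbf{q}_1) - (2\pi)^3 \xi_\phi(0)\, \delta^3(\mathbf{q}_1) \right] \cdots \left[ P_\phi(\mathbf{q}_n) - (2\pi)^3 \xi_\phi(0)\, \delta^3(\mathbf{q}_n) \right] \delta^3(\mathbf{q}_1 + \cdots + \mathbf{q}_n - \mathbf{k}) , \tag{C6}$$

or equivalently the Fourier transform of the n-th power of $\xi_\phi(\mathbf{r}) - \xi_\phi(0)$:

$$\mathcal{P}_n(\mathbf{k}) = \int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}} \left( \xi_\phi(\mathbf{r}) - \xi_\phi(0) \right)^n . \tag{C7}$$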


2. Second Order Power Spectrum and Asymptotic Behaviour 

In order to gain some intuition into the non-linear redshift space distortions, we turn to the leading order corrections
to the linear theory. Using Eq. (C5) we generate the perturbative results to second order in the power spectrum. The
lowest order term is simply

$${}^{(1)}P(\mathbf{k}) = P_\alpha(\mathbf{k}) , \tag{C8}$$

the linear redshift space power spectrum that we expect. The terms at the next order are

$${}^{(2)}P(\mathbf{k}) = k_\parallel^2 \int \frac{d^3k_0\, d^3k_1}{(2\pi)^3} \left[ P_\alpha(\mathbf{k}_0)\, P_\phi(\mathbf{k}_1) - P_\alpha(\mathbf{k}_0)\, \xi_\phi(0)\, (2\pi)^3 \delta^3(\mathbf{k}_1) - P_{\alpha\phi}(\mathbf{k}_0)\, P_{\alpha\phi}(\mathbf{k}_1) \right] \delta^3(\mathbf{k}_0 + \mathbf{k}_1 - \mathbf{k}) , \tag{C9}$$

where at second order in our expansion $n = 0$ and $\mathcal{P}_n(\mathbf{k}) = (2\pi)^3 \delta^3(\mathbf{k})$, giving the above. Specializing to the case of
the matter power spectrum $\Delta = \delta$, and expanding out in full, our result is in agreement with that of Ref. [4] when
other non-linear effects are neglected.

To investigate the asymptotic behaviour as $k$ becomes large compared to the turnover in the power spectrum we
Taylor expand the above in this limit. We must be careful to include the contributions from where either $|\mathbf{k}_0|$ or
$|\mathbf{k}_1|$ is small, as we expect the integral to be dominated by contributions from around the turnover. In this series
expansion the leading order terms in $\xi_\phi(0)$ cancel, leaving the dominant term



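$${}^{(2)}P(\mathbf{k}) \approx k^2\mu_k^2\int\frac{d^3q}{(2\pi)^3}\left\{P_\psi(\mathbf{k})P_\alpha(\mathbf{q}) + \frac{1}{2}q^a q^b\left(P_\alpha(\mathbf{q})\left[\nabla_a\nabla_b P_\psi(\mathbf{k})\right] + P_\psi(\mathbf{q})\left[\nabla_a\nabla_b P_\alpha(\mathbf{k})\right]\right) - 2P_{\alpha\psi}(\mathbf{q})\left[\nabla_a P_{\alpha\psi}(\mathbf{k})\right]q^a\right\}. \tag{C10}$$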
The $P_\alpha(\mathbf{q})\left[\nabla_a\nabla_b P_\psi(\mathbf{k})\right]$ term above is suppressed by a factor of $(q/k)^2$ relative to the other terms and so we will drop it from our expansion. Averaging out the angular components of the $\mathbf{q}$ integrals removes the summations over $a$ and $b$ and instead directly connects the $\mathbf{k}$ derivatives with the line of sight direction, giving



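$${}^{(2)}P(\mathbf{k}) \approx k^2\mu_k^2\left\{P_\psi(\mathbf{k})\int\frac{d^3q}{(2\pi)^3}P_\alpha(\mathbf{q}) + 2i\,\hat{\mathbf{n}}\cdot\nabla_k P_{\alpha\psi}(\mathbf{k})\int\frac{d^3q}{(2\pi)^3}\left[\frac{1}{3}P_{\Delta v}(q) + \frac{1}{5}P_v(q)\right] + \frac{1}{30}\left[\nabla_k^2 + 2(\hat{\mathbf{n}}\cdot\nabla_k)^2\right]P_\alpha(\mathbf{k})\int\frac{d^3q}{(2\pi)^3}P_v(q)\right\}. \tag{C11}$$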
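The operator $\nabla_k^2 + 2(\hat{\mathbf{n}}\cdot\nabla_k)^2$ in the last term reflects a standard angular average. As a hedged check, assuming the form $P_\psi(\mathbf{q}) = \mu_q^2 P_v(q)/q^2$ so that the required average is $\langle\hat q_a\hat q_b(\hat{\mathbf{n}}\cdot\hat{\mathbf{q}})^2\rangle$ over directions of $\mathbf{q}$, the sketch below compares a quadrature over the sphere with $(\delta_{ab} + 2\hat n_a\hat n_b)/15$. The function name, the quadrature orders and the test direction are illustrative choices.

```python
import numpy as np

def sphere_average_qq_mu2(n_hat, n_theta=16, n_phi=32):
    """Average of q_a q_b (n.q)^2 over unit vectors q.

    Uses Gauss-Legendre nodes in cos(theta) and a uniform grid in phi.
    """
    n_hat = np.asarray(n_hat, dtype=float)
    n_hat = n_hat / np.linalg.norm(n_hat)
    c, w = np.polynomial.legendre.leggauss(n_theta)  # nodes and weights on [-1, 1]
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    s = np.sqrt(1.0 - c ** 2)
    # Unit vectors on the grid, shape (n_theta, n_phi, 3).
    q = np.stack(
        [
            s[:, None] * np.cos(phi)[None, :],
            s[:, None] * np.sin(phi)[None, :],
            c[:, None] * np.ones(n_phi)[None, :],
        ],
        axis=-1,
    )
    mu2 = (q @ n_hat) ** 2
    weights = np.broadcast_to(w[:, None] / (2.0 * n_phi), mu2.shape)  # sums to 1
    return np.einsum("tp,tp,tpa,tpb->ab", weights, mu2, q, q)

n_hat = np.array([0.3, -0.5, 0.8])
n_hat = n_hat / np.linalg.norm(n_hat)

numeric = sphere_average_qq_mu2(n_hat)
expected = (np.eye(3) + 2.0 * np.outer(n_hat, n_hat)) / 15.0
print(np.max(np.abs(numeric - expected)))
```

Combined with the factor of $1/2$ from the Taylor expansion, this average gives the coefficient $1/30$ multiplying the integral of $P_v(q)$.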

Each term is of the form of the power spectrum at k (+derivatives) multiplied by a point variance coming from larger 
scales. For example the first term gives 

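$$k^2\mu_k^2\,P_\psi(\mathbf{k})\,\xi_\alpha(0) = \mu_k^4\,P_v(k)\,\xi_\alpha(0), \tag{C12}$$

where the point variance of the first order source is

$$\xi_\alpha(0) = \int\frac{d^3q}{(2\pi)^3}\left[P_\Delta(q) + \frac{2}{3}P_{\Delta v}(q) + \frac{1}{5}P_v(q)\right]. \tag{C13}$$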
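The coefficients $2/3$ and $1/5$ in Eq. (C13) are the angular averages of $2\mu^2$ and $\mu^4$. A quick numerical check follows, assuming the linear form $P_\alpha = P_\Delta + 2\mu^2 P_{\Delta v} + \mu^4 P_v$ for the first order source spectrum (an assumption here, since that definition is given outside this subsection) and arbitrary illustrative amplitudes at a single wavenumber:

```python
import numpy as np

def angle_averaged_source_power(p_delta, p_delta_v, p_v, order=8):
    """Average P_alpha = P_D + 2 mu^2 P_Dv + mu^4 P_v over mu in [-1, 1]."""
    mu, w = np.polynomial.legendre.leggauss(order)  # exact for this polynomial in mu
    p_alpha = p_delta + 2.0 * mu ** 2 * p_delta_v + mu ** 4 * p_v
    return 0.5 * np.sum(w * p_alpha)  # the factor 1/2 normalizes the mu integral

# Illustrative amplitudes only; not values from the paper.
p_delta, p_delta_v, p_v = 1.7, 0.9, 0.6

numeric = angle_averaged_source_power(p_delta, p_delta_v, p_v)
closed_form = p_delta + (2.0 / 3.0) * p_delta_v + (1.0 / 5.0) * p_v
print(numeric, closed_form)
```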



The other terms are more complicated, and for the approximation to make sense the integral ranges should be restricted to scales with $|\mathbf{q}| < |\mathbf{k}|$. The boost in power on small scales can therefore be thought of as due to sources at that scale superimposed on large scale linear modes. There are terms up to the sixth power of $\mu_k$.

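The behaviour on large scales can again be understood by examining the limit $k \ll k_0, k_1$. Expanding the integral for small $k$ we have

$${}^{(2)}P(\mathbf{k}) \approx k^2\mu_k^2\left(\int\frac{d^3q}{(2\pi)^3}\frac{1}{3q^2}\left[P_\Delta(q)P_v(q) - P_{\Delta v}(q)^2\right] - P_\alpha(\mathbf{k})\,\xi_\psi(0) + \cdots\right). \tag{C14}$$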

The first term vanishes in the case of perfect correlation between the source and the velocities, as is the case with one mode of linear perturbations. In this case the dominant contribution is the suppression due to the point line-of-sight velocity variance coming from smaller scales (given by $\xi_\psi(0)$). In the case where the source and velocities are not perfectly correlated on large scales, the integral is non-zero and positive (by the Cauchy-Schwarz inequality), reducing the level of suppression.
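The structure of the first term in Eq. (C14) can be checked directly. The sketch below assumes the linear forms $P_\alpha = P_\Delta + 2\mu^2P_{\Delta v} + \mu^4P_v$, $P_{\alpha\psi} = -i(\mu/q)(P_{\Delta v} + \mu^2P_v)$ and $P_\psi = \mu^2P_v/q^2$, and parametrizes the cross spectrum by a hypothetical correlation coefficient `r`, with $P_{\Delta v} = r\sqrt{P_\Delta P_v}$. As $k \to 0$ the two wavevectors in the convolution are $\mathbf{q}$ and $-\mathbf{q}$, and because $P_{\alpha\psi}$ is odd the integrand becomes $P_\alpha P_\psi + P_{\alpha\psi}^2$.

```python
import numpy as np

def small_k_integrand(q, mu, p_delta, p_v, r):
    """P_alpha P_psi + P_alphapsi^2 at wavevector q (the k -> 0 limit of the convolution)."""
    p_dv = r * np.sqrt(p_delta * p_v)  # cross spectrum from the assumed coefficient r
    p_alpha = p_delta + 2.0 * mu ** 2 * p_dv + mu ** 4 * p_v
    p_psi = mu ** 2 * p_v / q ** 2
    p_alpha_psi = -1j * (mu / q) * (p_dv + mu ** 2 * p_v)
    return (p_alpha * p_psi + p_alpha_psi ** 2).real

def reduced_form(q, mu, p_delta, p_v, r):
    """(mu^2 / q^2) [P_D P_v - P_Dv^2], the combination appearing in Eq. (C14)."""
    p_dv = r * np.sqrt(p_delta * p_v)
    return mu ** 2 / q ** 2 * (p_delta * p_v - p_dv ** 2)

# Illustrative values only.
mu = np.linspace(-1.0, 1.0, 21)
q, p_delta, p_v = 0.7, 2.0, 0.5

for r in (1.0, 0.6, 0.0):
    full = small_k_integrand(q, mu, p_delta, p_v, r)
    short = reduced_form(q, mu, p_delta, p_v, r)
    print(r, np.max(np.abs(full - short)), short.max())
```

For `r = 1` the integrand vanishes up to rounding, while for `|r| < 1` it is non-negative at every angle, which is the Cauchy-Schwarz statement $P_{\Delta v}^2 \le P_\Delta P_v$. Averaging $\mu^2$ over angles then gives the factor $1/3$ in Eq. (C14).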